Beyond Volume Pattern: Storage-Efficient Boolean Searchable Symmetric Encryption with Suppressed Leakage

  • Conference paper
Computer Security – ESORICS 2023 (ESORICS 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14344)

Abstract

Boolean Searchable Symmetric Encryption (BSSE) enables users to perform retrieval operations on encrypted data while supporting complex query capabilities. This paper addresses the storage overhead and privacy concerns associated with existing BSSE schemes. While Patel et al. (ASIACRYPT’21) and Bag et al. (PETS’23) introduced BSSE schemes that conceal the number of single-keyword results, both suffer from quadratic storage overhead and neglect the privacy of search and access patterns. Consequently, an open question arises: Can we design a storage-efficient Boolean query scheme that effectively suppresses leakage, covering not only the volume pattern for singleton keywords, but also search and access patterns?

In light of the limitations of existing schemes in terms of storage overhead and privacy protection, this work presents a novel solution called SESAME. It achieves efficient storage and privacy preservation by combining Bloom filters with functional encryption. Moreover, we propose an enhanced version, SESAME+, which offers improved search performance. Through a rigorous analysis of the leakage functions of our schemes, we provide a formal security proof. Finally, we implement our schemes and demonstrate that SESAME+ achieves superior search efficiency and reduced storage overhead.

Notes

  1. The data owner and the data user can be the same entity.

  2. SESAME implies a mystical code that unlocks the treasure.

  3. Representing all encrypted vectors as a matrix is a matter of convenience for notation purposes, and the actual computation still relies on the inner product operation of vectors.

  4. Similarly, it is represented as a matrix solely for descriptive purposes.

  5. Adaptive security denotes that the adversary can issue queries depending on previous queries, whereas non-adaptive security means that the adversary must prepare all the queries at the beginning of the BSSE security game.

  6. In this paper, unless explicitly specified, \(\textsf {TWINSSE}_\textsf {OXT}\) is used to represent a scheme specifically designed for processing Boolean queries in CNF form.

References

  1. Enron Email Dataset. https://www.cs.cmu.edu/~enron/. Accessed May 2015

  2. PyCryptodome. https://pycryptodome.readthedocs.io/en/latest/index.html

  3. The Pairing-Based Cryptography Library. https://crypto.stanford.edu/pbc/

  4. Abdalla, M., Bourse, F., De Caro, A., Pointcheval, D.: Simple functional encryption schemes for inner products. In: Katz, J. (ed.) PKC 2015. LNCS, vol. 9020, pp. 733–751. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46447-2_33

  5. Agrawal, S., Libert, B., Stehlé, D.: Fully secure functional encryption for inner products, from standard assumptions. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9816, pp. 333–362. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53015-3_12

  6. Bag, A., Talapatra, D., et al.: TWo-IN-one-SSE: fast, scalable and storage-efficient searchable symmetric encryption for conjunctive and disjunctive boolean queries. Proc. Priv. Enhancing Technol. 2023(1), 115–139 (2023)

  7. Bishop, A., Jain, A., Kowalczyk, L.: Function-hiding inner product encryption. In: Iwata, T., Cheon, J.H. (eds.) ASIACRYPT 2015. LNCS, vol. 9452, pp. 470–491. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48797-6_20

  8. Boneh, D., Sahai, A., Waters, B.: Functional encryption: definitions and challenges. In: Ishai, Y. (ed.) TCC 2011. LNCS, vol. 6597, pp. 253–273. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19571-6_16

  9. Bost, R.: \(\Sigma o\varphi o\varsigma \): forward secure searchable encryption. In: 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS 2016, Vienna, Austria, pp. 1143–1154. ACM (2016)

  10. Bost, R., Minaud, B., et al.: Forward and backward private searchable encryption from constrained cryptographic primitives. In: 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, pp. 1465–1482. ACM (2017)

  11. Cao, N., et al.: Privacy-preserving multi-keyword ranked search over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 25(1), 222–233 (2014)

  12. Cash, D., Jarecki, S., Jutla, C., Krawczyk, H., Roşu, M.-C., Steiner, M.: Highly-scalable searchable symmetric encryption with support for Boolean queries. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 353–373. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_20

  13. Cash, D., et al.: Dynamic searchable encryption in very-large databases: data structures and implementation. In: 21st Annual Network and Distributed System Security Symposium, NDSS 2014. The Internet Society (2014)

  14. Curtmola, R., Garay, J., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. In: 2006 ACM Conference on Computer and Communications Security, CCS 2006, pp. 79–88. ACM (2006)

  15. Demertzis, I., Papadopoulos, D., Papamanthou, C.: Searchable encryption with optimal locality: achieving sublogarithmic read efficiency. In: Shacham, H., Boldyreva, A. (eds.) CRYPTO 2018. LNCS, vol. 10991, pp. 371–406. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96884-1_13

  16. Fu, Z., Huang, F., Ren, K., Weng, J., Wang, C.: Privacy-preserving smart semantic search based on conceptual graphs over encrypted outsourced data. IEEE Trans. Inf. Forensics Secur. 12(8), 1874–1884 (2017)

  17. Grubbs, P., Lacharité, M.-S., Minaud, B., Paterson, K.G.: Pump up the volume: practical database reconstruction from volume leakage on range queries. In: 2018 ACM Conference on Computer and Communications Security, CCS 2018, pp. 315–331. ACM (2018)

  18. Gui, Z., Johnson, O., Warinschi, B.: Encrypted databases: new volume attacks against range queries. In: 2019 ACM Conference on Computer and Communications Security, CCS 2019, pp. 361–378. ACM (2019)

  19. Islam, M.S., Kuzu, M., et al.: Access pattern disclosure on searchable encryption: ramification, attack and mitigation. In: 19th Annual Network and Distributed System Security Symposium, NDSS 2012, p. 12. The Internet Society (2012)

  20. Kamara, S., Papamanthou, C., Roeder, T.: Dynamic searchable symmetric encryption. In: 2012 ACM Conference on Computer and Communications Security, CCS 2012, pp. 965–976. ACM (2012)

  21. Kamara, S., Moataz, T.: Boolean searchable symmetric encryption with worst-case sub-linear complexity. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10212, pp. 94–124. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56617-7_4

  22. Kellaris, G., Kollios, G., Nissim, K., O’Neill, A.: Generic attacks on secure outsourced databases. In: 2016 ACM Conference on Computer and Communications Security, CCS 2016, pp. 1329–1340. ACM (2016)

  23. Kim, S., Lewi, K., Mandal, A., Montgomery, H., Roy, A., Wu, D.J.: Function-hiding inner product encryption is practical. In: Catalano, D., De Prisco, R. (eds.) SCN 2018. LNCS, vol. 11035, pp. 544–562. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98113-0_29

  24. Kornaropoulos, E.M., Papamanthou, C., Tamassia, R.: The state of the uniform: attacks on encrypted databases beyond the uniform query distribution. In: 2020 IEEE Symposium on Security and Privacy, S&P 2020, pp. 1223–1240. IEEE (2020)

  25. Lai, S., Patranabis, S., et al.: Result pattern hiding searchable encryption for conjunctive queries. In: 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, pp. 745–762. ACM (2018)

  26. Liu, C., Zhu, L., Wang, M., Tan, Y.: Search pattern leakage in searchable encryption: attacks and new construction. Inf. Sci. 265, 176–188 (2014)

  27. Ning, J., Xu, J., Liang, K., Zhang, F., Chang, E.-C.: Passive attacks against searchable encryption. IEEE Trans. Inf. Forensics Secur. 14(3), 789–802 (2019)

  28. Oya, S., Kerschbaum, F.: Hiding the access pattern is not enough: exploiting search pattern leakage in searchable encryption. In: 30th USENIX Security Symposium, USENIX Security 2021, pp. 127–142. USENIX Association (2021)

  29. Patel, S., Persiano, G., Seo, J.Y., Yeo, K.: Efficient Boolean search over encrypted data with reduced leakage. In: Tibouchi, M., Wang, H. (eds.) ASIACRYPT 2021. LNCS, vol. 13092, pp. 577–607. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92078-4_20

  30. Pouliot, D., Wright, C.V.: The shadow nemesis: inference attacks on efficiently deployable, efficiently searchable encryption. In: 2016 ACM Conference on Computer and Communications Security, CCS 2016, pp. 1341–1352. ACM (2016)

  31. Shang, Z., Oya, S., Peter, A., Kerschbaum, F.: Obfuscated access and search patterns in searchable encryption. In: 28th Annual Network and Distributed System Security Symposium, NDSS 2021. The Internet Society (2021)

  32. Song, D.X., Wagner, D.A., Perrig, A.: Practical techniques for searches on encrypted data. In: 2000 IEEE Symposium on Security and Privacy, S&P 2000, pp. 44–55. IEEE Computer Society (2000)

  33. Wang, B., Yu, S., Lou, W., Hou, Y.T.: Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud. In: 2014 IEEE Conference on Computer Communications, INFOCOM 2014, pp. 2112–2120. IEEE (2014)

Acknowledgement

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2021YFB3101100 and in part by the China Scholarship Council.

Author information

Corresponding author

Correspondence to Xiangfu Song.

Appendix A Proof of Theorem 1

We provide a formal security proof of our construction SESAME+. We consider a database DB and a sequence of DNF queries \(\mathcal {Q} = \{Q_1, \cdots , Q_n\}\), where \(Q_i = q_{i,1} \vee \cdots \vee q_{i,m}\) consists of m conjunctions.

The leakage function \(\mathcal {L}_{\textsf {Setup}}\) captures the information leaked by the Setup algorithm. In our construction, we use Bloom filters to represent documents and encrypt them using functional encryption. Because the adversary can access only the encrypted vectors, the information it acquires is confined to the total number of encrypted vectors, d, and their length, l. Hence, the Setup leakage function is defined as \(\mathcal {L}_{\textsf {Setup}} = (d, l)\).
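As an illustration of what \(\mathcal {L}_{\textsf {Setup}}\) covers, the following minimal Python sketch builds plaintext Bloom-filter vectors for a few documents and reports only (d, l). It is not the authors' implementation: the keyed hash (HMAC-SHA256), the number of hash functions `k_hashes`, the filter length, and the sample documents are all assumptions made for the example, and the functional-encryption layer is omitted.

```python
import hashlib
import hmac


def bloom_positions(keyword: str, l: int, key: bytes, k_hashes: int = 3) -> set:
    """Map a keyword to at most k_hashes positions in [0, l) using a keyed hash.

    HMAC-SHA256 and k_hashes are illustrative assumptions, not taken from the paper.
    """
    positions = set()
    for i in range(k_hashes):
        digest = hmac.new(key, f"{i}:{keyword}".encode(), hashlib.sha256).digest()
        positions.add(int.from_bytes(digest, "big") % l)
    return positions


def bloom_vector(keywords: list, l: int, key: bytes) -> list:
    """Represent one document as a 0/1 vector of length l."""
    v = [0] * l
    for w in keywords:
        for pos in bloom_positions(w, l, key):
            v[pos] = 1
    return v


def setup_leakage(vectors: list) -> tuple:
    """What the server learns at Setup: the number of vectors d and their length l."""
    d = len(vectors)
    l = len(vectors[0]) if vectors else 0
    return d, l


if __name__ == "__main__":
    key = b"\x00" * 16  # placeholder key, for the example only
    docs = [["alice", "budget"], ["merger"], ["alice", "merger", "draft"]]
    vectors = [bloom_vector(ws, 32, key) for ws in docs]
    print(setup_leakage(vectors))
```

In the actual scheme the server would hold encryptions of these vectors rather than the vectors themselves; the sketch only shows that their count and length are independent of the document contents.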

The leakage function \(\mathcal {L}_{\textsf {Token}}\) summarizes the information that an adversary can acquire from the Token algorithm. Both the vector \(\boldsymbol{\alpha }\), which records the number of non-zero elements, and the vector \(\boldsymbol{\beta }\), which records the positions of non-zero elements, are sent to the server as auxiliary query information and are therefore visible to the adversary. Because \(\boldsymbol{U}\) can be derived from \(\boldsymbol{\beta }\), it is not listed separately in \(\mathcal {L}_{\textsf {Token}}\). Furthermore, \(\boldsymbol{\beta }\) discloses the number of clauses m in the query Q. Consequently, the Token leakage function is represented as \(\mathcal {L}_{\textsf {Token}} = (m, \boldsymbol{\alpha }, \boldsymbol{\beta })\).
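The auxiliary information in \(\mathcal {L}_{\textsf {Token}}\) can be pictured with a small Python sketch. It assumes that each conjunction of a DNF query is mapped to a Bloom-filter query vector, that \(\alpha _i\) is the number of non-zero entries of the i-th vector, and that \(\beta _i\) lists their positions; this reading follows the description above, while the hash construction and all names in the code are hypothetical.

```python
import hashlib
import hmac


def bloom_positions(keyword: str, l: int, key: bytes, k_hashes: int = 3) -> set:
    """Keyed Bloom-filter positions for one keyword (illustrative choice of hash)."""
    return {
        int.from_bytes(
            hmac.new(key, f"{i}:{keyword}".encode(), hashlib.sha256).digest(), "big"
        ) % l
        for i in range(k_hashes)
    }


def token_leakage(dnf_query: list, l: int, key: bytes) -> tuple:
    """Return (m, alpha, beta) for a DNF query given as a list of conjunctions.

    alpha[i] is the number of non-zero entries of the i-th query vector,
    beta[i] the sorted positions of those entries.
    """
    alpha, beta = [], []
    for conjunction in dnf_query:
        positions = set()
        for keyword in conjunction:
            positions |= bloom_positions(keyword, l, key)
        beta.append(sorted(positions))
        alpha.append(len(positions))
    return len(dnf_query), alpha, beta


if __name__ == "__main__":
    key = b"\x00" * 16  # placeholder key, for the example only
    query = [["alice", "budget"], ["merger"]]  # (alice AND budget) OR merger
    m, alpha, beta = token_leakage(query, 32, key)
    print(m, alpha, beta)
```

Note that m is simply the length of \(\boldsymbol{\beta }\), which is why the number of clauses is visible to the server.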

Regarding the information leaked by the Search algorithm, note that the server receives the output of Token, which is already included in \(\mathcal {L}_{\textsf {Token}}\). During query execution, the server prunes the matrix \(\boldsymbol{A}\) based on \(\beta _i\) to derive \(\boldsymbol{A}'\) for each clause in Q. Here \(\boldsymbol{A}\) denotes the ciphertext vectors generated in Setup, whose security is guaranteed by functional encryption. Subsequently, the server decrypts \(\boldsymbol{A}'\) to obtain the inner product result \(\boldsymbol{r}_i\), which the adversary can observe. Additionally, the server discloses the query’s result set \(\mathcal {R}\), which is also accessible to the adversary. Therefore, the Search leakage function is defined as \(\mathcal {L}_{\textsf {Search}} = (\{\boldsymbol{r}_1, \cdots , \boldsymbol{r}_{m}\}, \mathcal {R})\).
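The pruning and inner-product steps can be sketched on plaintext data. The Python example below is a stand-in for the encrypted computation, not the scheme itself: the matrix holds plain 0/1 Bloom vectors instead of ciphertexts, and it assumes that a document matches clause i exactly when its inner product over the positions in \(\beta _i\) equals \(\alpha _i\) (Bloom-filter false positives aside). That matching rule is an inference from the roles of \(\boldsymbol{\alpha }\) and \(\boldsymbol{\beta }\) described above.

```python
def search_plain(A: list, alpha: list, beta: list) -> tuple:
    """Plaintext analogue of Search.

    A     : d x l matrix of 0/1 Bloom vectors (one row per document)
    alpha : alpha[i] = number of non-zero entries in the i-th clause vector
    beta  : beta[i]  = positions of those entries
    Returns ([r_1, ..., r_m], R), mirroring the Search leakage.
    """
    r_all = []
    result = set()
    for a_i, b_i in zip(alpha, beta):
        # Prune: keep only the columns named in beta_i.
        pruned = [[row[p] for p in b_i] for row in A]
        # The clause vector is all ones on the kept columns, so the inner
        # product with each pruned row is just the row sum.
        r_i = [sum(row) for row in pruned]
        r_all.append(r_i)
        # Assumed matching rule: every queried position is set in the document.
        result |= {doc for doc, r in enumerate(r_i) if r == a_i}
    return r_all, sorted(result)


if __name__ == "__main__":
    A = [
        [1, 1, 0, 1],
        [0, 1, 1, 0],
        [1, 0, 0, 1],
    ]
    beta = [[0, 1], [2]]
    alpha = [2, 1]
    print(search_plain(A, alpha, beta))  # ([[2, 1, 1], [0, 1, 0]], [0, 1])
```

The vectors \(\boldsymbol{r}_i\) reveal partial matches (for instance, a document that sets one of two queried positions), which is why they appear in \(\mathcal {L}_{\textsf {Search}}\) alongside \(\mathcal {R}\).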

Proof

To demonstrate that \({\textbf {Real}}_{\mathcal {A}}^{{\textsf {SESAME+}}}(\lambda )\) and \({\textbf {Ideal}}_{\mathcal {A,S}}^{{\textsf {SESAME+}}}(\lambda )\) are computationally indistinguishable, we characterize a probabilistic polynomial-time simulator \(\mathcal {S}\) capable of simulating the three protocols in our SESAME+ scheme. The simulator \(\mathcal {S}\) must be able to regenerate the encrypted database and tokens from the leakage information \(\mathcal {L}\), with the regenerated tokens satisfying the dependencies among the leakage functions \(\mathcal {L}_{\textsf {Setup}}\), \(\mathcal {L}_{\textsf {Token}}\), and \(\mathcal {L}_{\textsf {Search}}\), in order to prevent the adversary \(\mathcal {A}\) from distinguishing between the real world and ideal world scenarios. The adversary \(\mathcal {A}\) has access to the simulated encrypted database and can retrieve data using the simulated tokens.

Provided the leakage information \(\mathcal {L} = (\mathcal {L}_{\textsf {Setup}}, \mathcal {L}_{\textsf {Token}}, \mathcal {L}_{\textsf {Search}})\), the simulations can be formulated as follows:

To simulate the Setup protocol, \(\mathcal {S}\) selects a cyclic group \(\mathbb {G}\) of prime order \(p > 2^\lambda \) together with two randomly chosen generators g and h of \(\mathbb {G}\). Then, \(\mathcal {S}\) randomly samples \(s_i, t_i \leftarrow \mathbb {Z}_p\) for each \(i \in \{1, \cdots , l\}\), where l is determined by \(\mathcal {L}_{\textsf {Setup}}\), randomly samples \(k \leftarrow \{0,1\}^\lambda \), and computes \(h_i = g^{s_i} \cdot h^{t_i}\). As a result, \(\mathcal {S}\) simulates the master secret key and master public key as \(\textsf {msk} := (\textsf {sk}_{\textsf {IPFE}}=\{(s_i, t_i)\}^l_{i=1}, k)\) and \(\textsf {mpk} := (\mathbb {G}, g, h, \{h_i\}_{i=1}^{l})\), respectively.

For simulating the EDB, \(\mathcal {S}\) generates d Bloom filters \(\boldsymbol{v}_i\) of length l. These vectors are constructed to maintain dependencies with the leakage functions \(\mathcal {L}_{\textsf {Token}}\) and \(\mathcal {L}_{\textsf {Search}}\), ensuring that the adversary’s verification using simulated tokens remains valid. The adversary can only learn the length l of the vectors and the number of vectors d, as they only have access to the encrypted vectors. Finally, the simulator \(\mathcal {S}\) employs functional encryption for inner product with the mpk to encrypt the vectors and simulate the encrypted database EDB.

In the context of the Setup protocol, given the leakage information \(\mathcal {L}\), the simulator \(\mathcal {S}\) generates simulated outputs, including the encrypted database EDB, the master public key mpk, and the master secret key msk. The simulated EDB differs from the real one only in how \(\boldsymbol{v}_i\) is chosen. Instead of obtaining \(\boldsymbol{v}_i\) from the document mapping, \(\mathcal {S}\) selects \(\boldsymbol{v}_i\) using the leakage functions \(\mathcal {L}_{\textsf {Token}}\) and \(\mathcal {L}_{\textsf {Search}}\) and then encrypts it. The advantage in distinguishing the two is negligible provided the functional encryption scheme is fully secure. The simulated mpk and msk are distributed identically to those of the real world.

In the simulation of the Token protocol, \(\mathcal {S}\) simulates tokens for Boolean queries based on the leakage function \(\mathcal {L}_{\textsf {Token}}\) and ensures that these tokens can operate on the simulated encrypted database EDB. The leakage information provided by \(\mathcal {L}_{\textsf {Token}}\) reveals the positions of non-zero elements in the vector for each Boolean query clause, as well as the number of clauses for each Boolean query. Consequently, \(\mathcal {S}\) can generate tokens that are identical to those in the real experiment. To simulate the decryption key, \(\mathcal {S}\) feeds the leaked position information to the Keygen algorithm of functional encryption. The advantage of \(\mathcal {A}\) in distinguishing between the real world and the ideal world is negligible if the functional encryption is secure.

When simulating the Search protocol, \(\mathcal {S}\) retrieves documents from the encrypted database EDB based on a given Boolean query. Upon receiving the simulated token tok, \(\mathcal {S}\) prunes the simulated EDB according to the corresponding \(\boldsymbol{\beta }\), then performs the decryption process on the pruned EDB to obtain the identifiers of documents that satisfy the query. Since both EDB and tok are simulated based on the leakage function \(\mathcal {L}\), the search process performed on the simulated token leaks the same information as \(\mathcal {L}_{\textsf {Search}}\). Consequently, \(\mathcal {A}\) cannot distinguish between the real world and the ideal world with more than negligible probability.

In the above proof, we describe a probabilistic polynomial-time simulator \(\mathcal {S}\) that simulates the real experiment using only the leakage information \(\mathcal {L}\). Assuming that functional encryption for inner product is secure, our scheme SESAME+ is \(\mathcal {L}\)-secure, that is,

$$|\text {Pr}[{\textbf {Real}}_{\mathcal {A}}^{{\textsf {SESAME}+}}(\lambda ) = 1] - \text {Pr}[{\textbf {Ideal}}_{\mathcal {A}, \mathcal {S}}^{{\textsf {SESAME}+}}(\lambda ) = 1]|\le \textsf {negl}(\lambda ).$$

Remark. Due to subtle issues arising from the underlying inner product functional encryption, we prove \(\textsf {SESAME}+\) secure only in the non-adaptive setting, i.e., the adversary issues all queries before running the game. Designing an adaptively secure BSSE scheme with properties similar to those of \(\textsf {SESAME}+\) seems to require fundamentally different primitives and proof techniques, which we leave as future work.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, F., Ma, J., Miao, Y., Wu, P., Song, X. (2024). Beyond Volume Pattern: Storage-Efficient Boolean Searchable Symmetric Encryption with Suppressed Leakage. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14344. Springer, Cham. https://doi.org/10.1007/978-3-031-50594-2_7

  • DOI: https://doi.org/10.1007/978-3-031-50594-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50593-5

  • Online ISBN: 978-3-031-50594-2

  • eBook Packages: Computer Science, Computer Science (R0)