Abstract
Boolean Searchable Symmetric Encryption (BSSE) enables users to perform retrieval operations on the encrypted data while supporting complex query capabilities. This paper focuses on addressing the storage overhead and privacy concerns associated with existing BSSE schemes. While Patel et al. (ASIACRYPT’21) and Bag et al. (PETS’23) introduced BSSE schemes that conceal the number of single keyword results, both of them suffer from quadratic storage overhead and neglect the privacy of search and access patterns. Consequently, an open question arises: Can we design a storage-efficient Boolean query scheme that effectively suppresses leakage, covering not only the volume pattern for singleton keywords, but also search and access patterns?
In light of the limitations of existing schemes in terms of storage overhead and privacy protection, this work presents a novel solution called SESAME. It achieves storage efficiency and privacy preservation by combining Bloom filters with functional encryption. Moreover, we propose an enhanced version, SESAME+, which offers improved search performance. Through a rigorous analysis of the leakage functions of our schemes, we provide a formal security proof. Finally, we implement our schemes and demonstrate that SESAME+ achieves superior search efficiency and reduced storage overhead.
Notes
1. The data owner and the data user can be the same entity.
2. SESAME implies a mystical code that unlocks the treasure.
3. Representing all encrypted vectors as a matrix is a matter of convenience for notation purposes, and the actual computation still relies on the inner product operation of vectors.
4. Similarly, it is represented as a matrix solely for descriptive purposes.
5. Adaptive security denotes that the adversary can issue queries depending on previous queries, whereas non-adaptive security means that the adversary must prepare all the queries at the beginning of the BSSE security game.
6. In this paper, unless explicitly specified, \(\textsf {TWINSSE}_\textsf {OXT}\) is used to represent a scheme specifically designed for processing Boolean queries in CNF form.
References
Enron Email Dataset. https://www.cs.cmu.edu/~enron/. Accessed May 2015
PyCryptodome. https://pycryptodome.readthedocs.io/en/latest/index.html
The Pairing-Based Cryptography Library. https://crypto.stanford.edu/pbc/
Abdalla, M., Bourse, F., De Caro, A., Pointcheval, D.: Simple functional encryption schemes for inner products. In: Katz, J. (ed.) PKC 2015. LNCS, vol. 9020, pp. 733–751. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46447-2_33
Agrawal, S., Libert, B., Stehlé, D.: Fully secure functional encryption for inner products, from standard assumptions. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9816, pp. 333–362. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53015-3_12
Bag, A., Talapatra, D., et al.: TWo-IN-one-SSE: fast, scalable and storage-efficient searchable symmetric encryption for conjunctive and disjunctive boolean queries. Proc. Priv. Enhancing Technol. 2023(1), 115–139 (2023)
Bishop, A., Jain, A., Kowalczyk, L.: Function-hiding inner product encryption. In: Iwata, T., Cheon, J.H. (eds.) ASIACRYPT 2015. LNCS, vol. 9452, pp. 470–491. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48797-6_20
Boneh, D., Sahai, A., Waters, B.: Functional encryption: definitions and challenges. In: Ishai, Y. (ed.) TCC 2011. LNCS, vol. 6597, pp. 253–273. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19571-6_16
Bost, R.: \(\sum \)o\(\varphi \)o\(\varsigma \): forward secure searchable encryption. In: 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS 2016, Vienna, Austria, pp. 1143–1154. ACM (2016)
Bost, R., Minaud, B., et al.: Forward and backward private searchable encryption from constrained cryptographic primitives. In: 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, pp. 1465–1482. ACM (2017)
Cao, N., et al.: Privacy-preserving multi-keyword ranked search over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 25(1), 222–233 (2014)
Cash, D., Jarecki, S., Jutla, C., Krawczyk, H., Roşu, M.-C., Steiner, M.: Highly-scalable searchable symmetric encryption with support for Boolean queries. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 353–373. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_20
Cash, D., et al.: Dynamic searchable encryption in very-large databases: data structures and implementation. In: 21st Annual Network and Distributed System Security Symposium, NDSS 2014. The Internet Society (2014)
Curtmola, R., Garay, J., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. In: 2006 ACM Conference on Computer and Communications Security, CCS 2006, pp. 79–88. ACM (2006)
Demertzis, I., Papadopoulos, D., Papamanthou, C.: Searchable encryption with optimal locality: achieving sublogarithmic read efficiency. In: Shacham, H., Boldyreva, A. (eds.) CRYPTO 2018. LNCS, vol. 10991, pp. 371–406. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96884-1_13
Fu, Z., Huang, F., Ren, K., Weng, J., Wang, C.: Privacy-preserving smart semantic search based on conceptual graphs over encrypted outsourced data. IEEE Trans. Inf. Forensics Secur. 12(8), 1874–1884 (2017)
Grubbs, P., Lacharité, M.-S., Minaud, B., Paterson, K.G.: Pump up the volume: practical database reconstruction from volume leakage on range queries. In: 2018 ACM Conference on Computer and Communications Security, CCS 2018, pp. 315–331. ACM (2018)
Gui, Z., Johnson, O., Warinschi, B.: Encrypted databases: new volume attacks against range queries. In: 2019 ACM Conference on Computer and Communications Security, CCS 2019, pp. 361–378. ACM (2019)
Islam, M.S., Kuzu, M., et al.: Access pattern disclosure on searchable encryption: ramification, attack and mitigation. In: 19th Annual Network and Distributed System Security Symposium, NDSS 2012, p. 12. The Internet Society (2012)
Kamara, S., Papamanthou, C., Roeder, T.: Dynamic searchable symmetric encryption. In: 2012 ACM Conference on Computer and Communications Security, CCS 2012, pp. 965–976. ACM (2012)
Kamara, S., Moataz, T.: Boolean searchable symmetric encryption with worst-case sub-linear complexity. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10212, pp. 94–124. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56617-7_4
Kellaris, G., Kollios, G., Nissim, K., O’Neill, A.: Generic attacks on secure outsourced databases. In: 2016 ACM Conference on Computer and Communications Security, CCS 2016, pp. 1329–1340. ACM (2016)
Kim, S., Lewi, K., Mandal, A., Montgomery, H., Roy, A., Wu, D.J.: Function-hiding inner product encryption is practical. In: Catalano, D., De Prisco, R. (eds.) SCN 2018. LNCS, vol. 11035, pp. 544–562. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98113-0_29
Kornaropoulos, E.M., Papamanthou, C., Tamassia, R.: The state of the uniform: attacks on encrypted databases beyond the uniform query distribution. In: 2020 IEEE Symposium on Security and Privacy, S&P 2020, pp. 1223–1240. IEEE (2020)
Lai, S., Patranabis, S., et al.: Result pattern hiding searchable encryption for conjunctive queries. In: 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, pp. 745–762. ACM (2018)
Liu, C., Zhu, L., Wang, M., Tan, Y.: Search pattern leakage in searchable encryption: attacks and new construction. Inf. Sci. 265, 176–188 (2014)
Ning, J., Xu, J., Liang, K., Zhang, F., Chang, E.-C.: Passive attacks against searchable encryption. IEEE Trans. Inf. Forensics Secur. 14(3), 789–802 (2019)
Oya, S., Kerschbaum, F.: Hiding the access pattern is not enough: exploiting search pattern leakage in searchable encryption. In: 30th USENIX Security Symposium, USENIX Security 2021, pp. 127–142. USENIX Association (2021)
Patel, S., Persiano, G., Seo, J.Y., Yeo, K.: Efficient Boolean search over encrypted data with reduced leakage. In: Tibouchi, M., Wang, H. (eds.) ASIACRYPT 2021. LNCS, vol. 13092, pp. 577–607. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92078-4_20
Pouliot, D., Wright, C.V.: The shadow nemesis: inference attacks on efficiently deployable, efficiently searchable encryption. In: 2016 ACM Conference on Computer and Communications Security, CCS 2016, pp. 1341–1352. ACM (2016)
Shang, Z., Oya, S., Peter, A., Kerschbaum, F.: Obfuscated access and search patterns in searchable encryption. In: 28th Annual Network and Distributed System Security Symposium, NDSS 2021. The Internet Society (2021)
Song, D.X., Wagner, D.A., Perrig, A.: Practical techniques for searches on encrypted data. In: 2000 IEEE Symposium on Security and Privacy, S&P 2000, pp. 44–55. IEEE Computer Society (2000)
Wang, B., Yu, S., Lou, W., Hou, Y.T.: Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud. In: 2014 IEEE Conference on Computer Communications, INFOCOM 2014, pp. 2112–2120. IEEE (2014)
Acknowledgement
This work was supported in part by the National Key Research and Development Program of China under Grant No. 2021YFB3101100 and in part by the China Scholarship Council.
Appendix A Proof of Theorem 1
We provide a formal security proof of our construction SESAME+. We consider a database DB and a sequence of DNF queries \(\mathcal {Q} = \{Q_1, \cdots , Q_n\}\), where \(Q_i = q_{i,1} \vee \cdots \vee q_{i,m}\) consists of m conjunctions.
The leakage function \(\mathcal {L}_{\textsf {Setup}}\) captures the information leaked by the Setup algorithm. In our construction, we represent documents as Bloom filters and encrypt them using functional encryption. Because the adversary can access only the encrypted vectors, the information it acquires is confined to the total number of encrypted vectors, d, and their length, l. Hence, the Setup leakage function is defined as \(\mathcal {L}_{\textsf {Setup}} = (d, l)\).
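As an illustration of why only (d, l) is visible after Setup, the following minimal sketch builds plaintext Bloom filters for a few documents and reports their count and length. The hash construction (salted SHA-256), the number of hash functions, the sample keywords, and the helper names `bloom_positions`, `bloom_filter`, and `setup_leakage` are all assumptions made for this example; the paper's actual parameters are not stated in this section, and the encryption step is omitted here.

```python
import hashlib

def bloom_positions(keyword, l, num_hashes=3):
    """Map a keyword to at most num_hashes positions in [0, l).

    Salted SHA-256 is an assumption of this sketch, not the paper's choice.
    """
    return sorted({
        int.from_bytes(hashlib.sha256(f"{i}:{keyword}".encode()).digest(), "big") % l
        for i in range(num_hashes)
    })

def bloom_filter(keywords, l, num_hashes=3):
    """Return a 0/1 vector of length l with the positions of all keywords set."""
    v = [0] * l
    for w in keywords:
        for pos in bloom_positions(w, l, num_hashes):
            v[pos] = 1
    return v

def setup_leakage(vectors):
    """Setup leakage (d, l): number of vectors and their common length."""
    d = len(vectors)
    l = len(vectors[0]) if vectors else 0
    return d, l

# Hypothetical toy database: document name -> keywords.
docs = {
    "doc1": ["alice", "budget"],
    "doc2": ["bob", "budget", "merger"],
    "doc3": ["alice"],
}
L = 64
filters = [bloom_filter(kws, L) for kws in docs.values()]
print(setup_leakage(filters))  # (3, 64)
```

In the scheme itself each vector would be encrypted before being uploaded, so the server would see d ciphertext vectors of length l and nothing about which bits are set.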
The leakage function \(\mathcal {L}_{\textsf {Token}}\) summarizes the information that an adversary can acquire from the Token algorithm. Both the vector \(\boldsymbol{\alpha }\), which records the number of non-zero elements, and the vector \(\boldsymbol{\beta }\), which records the positions of non-zero elements, are sent to the server as auxiliary query information and are therefore visible to the adversary. Because \(\boldsymbol{U}\) can be derived from \(\boldsymbol{\beta }\), it need not be listed separately in \(\mathcal {L}_{\textsf {Token}}\). Furthermore, \(\boldsymbol{\beta }\) discloses the number of clauses m in the query Q. Consequently, the Token leakage function is represented as \(\mathcal {L}_{\textsf {Token}} = (m, \boldsymbol{\alpha }, \boldsymbol{\beta })\).
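The Token leakage can be pictured with a small sketch that derives (m, α, β) from a DNF query. It assumes that each conjunction is mapped to the union of the Bloom filter positions of its keywords; that mapping, the salted SHA-256 hashing, and the names `bloom_positions` and `token_leakage` are illustrative assumptions rather than the paper's exact Token algorithm.

```python
import hashlib

def bloom_positions(keyword, l, num_hashes=3):
    """Hypothetical keyword-to-position mapping (salted SHA-256, assumed)."""
    return sorted({
        int.from_bytes(hashlib.sha256(f"{i}:{keyword}".encode()).digest(), "big") % l
        for i in range(num_hashes)
    })

def token_leakage(dnf_query, l):
    """Compute (m, alpha, beta) for a DNF query given as a list of conjunctions.

    beta[i]  : positions of the non-zero entries of the i-th clause vector
    alpha[i] : number of non-zero entries of the i-th clause vector
    m        : number of clauses, readable directly from len(beta)
    """
    beta = []
    for clause in dnf_query:
        positions = sorted({p for w in clause for p in bloom_positions(w, l)})
        beta.append(positions)
    alpha = [len(b) for b in beta]
    m = len(beta)
    return m, alpha, beta

# (alice AND budget) OR (merger), over Bloom filters of length 64.
query = [["alice", "budget"], ["merger"]]
m, alpha, beta = token_leakage(query, 64)
print(m, alpha, beta)
```

The sketch also shows why m adds nothing beyond β: the number of clauses is simply the number of position lists.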
Regarding the information leaked by the Search algorithm, note that the server receives the output of the Token algorithm, which is already captured by \(\mathcal {L}_{\textsf {Token}}\). During query execution, the server prunes the matrix \(\boldsymbol{A}\) based on \(\beta _i\) to derive \(\boldsymbol{A}'\) for each clause in Q, where \(\boldsymbol{A}\) denotes the ciphertext vectors generated in Setup under functional encryption, whose security protects their contents. The server then decrypts \(\boldsymbol{A}'\) to obtain the inner product result \(\boldsymbol{r}_i\), which the adversary can observe. The server also discloses the query’s result set \(\mathcal {R}\), which is likewise accessible to the adversary. Therefore, the Search leakage function is defined as \(\mathcal {L}_{\textsf {Search}} = (\{\boldsymbol{r}_1, \cdots , \boldsymbol{r}_{m}\}, \mathcal {R})\).
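The pruning and inner product steps can be sketched over plaintext 0/1 vectors as follows. Two points are assumptions of this example and are not stated in this section: the clause vector is taken to be all ones on the positions in β_i, and a document is counted as matching a clause when its inner product equals α_i. The functions `prune` and `search` are hypothetical names, and no encryption is involved.

```python
def prune(A, beta_i):
    """Keep only the columns of A listed in beta_i (one row per document)."""
    return [[row[p] for p in beta_i] for row in A]

def search(A, alpha, beta):
    """Return the per-clause inner products r and the result set R.

    Assumes the clause vector is all ones on beta_i, so the inner product
    of a pruned row with it is just the row sum.
    """
    r = []
    result = set()
    for alpha_i, beta_i in zip(alpha, beta):
        A_pruned = prune(A, beta_i)
        r_i = [sum(row) for row in A_pruned]
        r.append(r_i)
        # Assumed matching rule: every queried position is set in the document.
        result |= {doc for doc, score in enumerate(r_i) if score == alpha_i}
    return r, sorted(result)

# Four toy document vectors of length 6 (plaintext stand-ins for A).
A = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
]
alpha = [2, 2]
beta = [[0, 4], [1, 2]]
r, R = search(A, alpha, beta)
print(r)  # [[2, 0, 2, 0], [1, 2, 0, 0]]
print(R)  # [0, 1, 2]
```

Everything printed here corresponds to what the leakage function records: the vectors r_1, ..., r_m and the result set R.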
Proof
To demonstrate that \({\textbf {Real}}_{\mathcal {A}}^{{\textsf {SESAME+}}}(\lambda )\) and \({\textbf {Ideal}}_{\mathcal {A,S}}^{{\textsf {SESAME+}}}(\lambda )\) are computationally indistinguishable, we characterize a probabilistic polynomial-time simulator \(\mathcal {S}\) capable of simulating the three protocols in our SESAME+ scheme. The simulator \(\mathcal {S}\) must be able to regenerate the encrypted database and tokens from the leakage information \(\mathcal {L}\), with the regenerated tokens satisfying the dependencies among the leakage functions \(\mathcal {L}_{\textsf {Setup}}\), \(\mathcal {L}_{\textsf {Token}}\), and \(\mathcal {L}_{\textsf {Search}}\), in order to prevent the adversary \(\mathcal {A}\) from distinguishing between the real world and ideal world scenarios. The adversary \(\mathcal {A}\) has access to the simulated encrypted database and can retrieve data using the simulated tokens.
Provided the leakage information \(\mathcal {L} = (\mathcal {L}_{\textsf {Setup}}, \mathcal {L}_{\textsf {Token}}, \mathcal {L}_{\textsf {Search}})\), the simulations can be formulated as follows:
To simulate the Setup protocol, \(\mathcal {S}\) selects a cyclic group \(\mathbb {G}\) of prime order \(p > 2^\lambda \). Then, \(\mathcal {S}\) randomly samples \(s_i, t_i \leftarrow \mathbb {Z}_p\) for each \(i \in \{1, \cdots , l\}\), where l is determined by \(\mathcal {L}_{\textsf {Setup}}\), randomly samples \(k \leftarrow \{0,1\}^\lambda \), and computes \(h_i = g^{s_i} \cdot h^{t_i}\), where g and h are two randomly chosen generators of \(\mathbb {G}\). As a result, \(\mathcal {S}\) simulates the master secret key and master public key as \(\textsf {msk} := (\textsf {sk}_{\textsf {IPFE}}=\{(s_i, t_i)\}^l_{i=1}, k)\) and \(\textsf {mpk} := (\mathbb {G}, g, h, \{h_i\}_{i=1}^{l})\), respectively.
For simulating the EDB, \(\mathcal {S}\) generates d Bloom filters \(\boldsymbol{v}_i\) of length l. These vectors are constructed to maintain dependencies with the leakage functions \(\mathcal {L}_{\textsf {Token}}\) and \(\mathcal {L}_{\textsf {Search}}\), ensuring that the adversary’s verification using simulated tokens remains valid. The adversary can only learn the length l of the vectors and the number of vectors d, as it has access only to the encrypted vectors. Finally, the simulator \(\mathcal {S}\) employs functional encryption for inner product with the mpk to encrypt the vectors and simulate the encrypted database EDB.
In the context of the Setup protocol, given the leakage information \(\mathcal {L}\), the simulator \(\mathcal {S}\) generates simulated outputs, including the encrypted database EDB, the master public key mpk, and the master secret key msk. The simulated EDB differs from the real one only in how \(\boldsymbol{v}_i\) is chosen. Instead of deriving \(\boldsymbol{v}_i\) from the document mapping, \(\mathcal {S}\) selects \(\boldsymbol{v}_i\) using the leakage functions \(\mathcal {L}_{\textsf {Token}}\) and \(\mathcal {L}_{\textsf {Search}}\) and then encrypts it. The advantage of distinguishing the two is negligible if the functional encryption is fully secure. The simulated mpk and msk are distributed identically to those of the real world.
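To make the idea of choosing \(\boldsymbol{v}_i\) from the leakage concrete, the sketch below builds plaintext vectors whose inner products on the leaked positions reproduce the leaked values. It is a simplification under stated assumptions: the position sets of different clauses are required to be disjoint (overlapping sets would need a joint consistency argument that is not attempted here), clause vectors are taken to be all ones on β_i, positions never touched by a query are left at zero, and `simulate_bloom_filters` is a hypothetical name. The subsequent encryption under mpk is omitted.

```python
import random

def simulate_bloom_filters(d, l, beta, r, rng=random):
    """Build d vectors of length l consistent with leaked (beta, r).

    beta[i]    : leaked positions of clause i
    r[i][j]    : leaked inner product of clause i with document j
    Assumption : the sets beta[i] are pairwise disjoint.
    """
    used = [p for b in beta for p in b]
    if len(used) != len(set(used)):
        raise ValueError("this sketch assumes disjoint position sets")
    vectors = [[0] * l for _ in range(d)]
    for beta_i, r_i in zip(beta, r):
        for j in range(d):
            # Set exactly r_i[j] of the leaked positions in document j.
            for p in rng.sample(beta_i, r_i[j]):
                vectors[j][p] = 1
    return vectors

d, l = 3, 8
beta = [[0, 4], [1, 2, 6]]
r = [[2, 0, 1], [1, 3, 0]]
vectors = simulate_bloom_filters(d, l, beta, r)

# Every simulated vector reproduces the leaked inner products.
for beta_i, r_i in zip(beta, r):
    assert [sum(v[p] for p in beta_i) for v in vectors] == r_i
```

The adversary only ever sees these vectors in encrypted form, so the arbitrary choice of which leaked positions to set is hidden as long as the encryption is secure.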
In the simulation of the Token protocol, \(\mathcal {S}\) simulates tokens for Boolean queries based on the leakage function \(\mathcal {L}_{\textsf {Token}}\) and ensures that these tokens can operate on the simulated encrypted database EDB. The leakage information provided by \(\mathcal {L}_{\textsf {Token}}\) reveals the positions of non-zero elements in the vector for each Boolean query clause, as well as the number of clauses for each Boolean query. Consequently, \(\mathcal {S}\) can generate tokens that are identical to those in the real experiment. To simulate the decryption key, \(\mathcal {S}\) feeds the leaked position information into the Keygen algorithm of functional encryption. The advantage of \(\mathcal {A}\) in distinguishing between the real world and the ideal world is negligible if the functional encryption is secure.
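A minimal sketch of deriving a decryption key from leaked positions alone follows. It assumes a DDH-style inner product scheme in which the key for a vector y is \((\langle s, y\rangle , \langle t, y\rangle )\) modulo the group order, and it assumes the clause vector is all ones on the leaked positions. The toy modulus and the helper names are hypothetical.

```python
import secrets

P = 1019  # toy prime group order (assumed, far too small for real use)

def sample_msk(l):
    """Sample the IPFE part of msk: l pairs (s_i, t_i) in Z_P."""
    s = [secrets.randbelow(P) for _ in range(l)]
    t = [secrets.randbelow(P) for _ in range(l)]
    return s, t

def key_from_vector(msk, y):
    """Keygen for an explicit query vector y."""
    s, t = msk
    return (sum(si * yi for si, yi in zip(s, y)) % P,
            sum(ti * yi for ti, yi in zip(t, y)) % P)

def key_from_positions(msk, beta_i):
    """Keygen from leaked positions only (all-ones vector on beta_i)."""
    s, t = msk
    return (sum(s[p] for p in beta_i) % P,
            sum(t[p] for p in beta_i) % P)

msk = sample_msk(8)
beta_i = [0, 4, 6]
y = [1 if p in beta_i else 0 for p in range(8)]
print(key_from_positions(msk, beta_i) == key_from_vector(msk, y))  # True
```

The equality check illustrates why the positions suffice: for a 0/1 query vector, the key depends only on which coordinates are non-zero.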
When simulating the Search protocol, \(\mathcal {S}\) retrieves documents from the encrypted database EDB based on a given Boolean query. Upon receiving the simulated token tok, \(\mathcal {S}\) prunes the simulated EDB according to the corresponding \(\boldsymbol{\beta }\), then performs the decryption process on the pruned EDB to obtain the identifiers of documents that satisfy the query. Since both EDB and tok are simulated based on the leakage function \(\mathcal {L}\), the search process performed on the simulated token leaks the same information as \(\mathcal {L}_{\textsf {Search}}\). Consequently, \(\mathcal {A}\) cannot distinguish between the real world and the ideal world with more than negligible probability.
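In the above proof, we describe a probabilistic polynomial-time simulator \(\mathcal {S}\) that simulates the real experiment using only the leakage information \(\mathcal {L}\). Assuming that functional encryption for inner product is secure, our scheme SESAME+ is \(\mathcal {L}\)-secure, that is, \({\textbf {Real}}_{\mathcal {A}}^{{\textsf {SESAME+}}}(\lambda )\) and \({\textbf {Ideal}}_{\mathcal {A,S}}^{{\textsf {SESAME+}}}(\lambda )\) are computationally indistinguishable.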
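In the usual simulation-based formulation, this indistinguishability is written as a bound on the adversary's distinguishing advantage. The expression below follows that common convention and is not quoted from the paper; in particular, the symbol \(\textsf {negl}\) for a negligible function and the convention that each experiment outputs a single bit are assumptions of this sketch.

```latex
\left| \Pr\left[ \textbf{Real}_{\mathcal{A}}^{\textsf{SESAME+}}(\lambda) = 1 \right]
     - \Pr\left[ \textbf{Ideal}_{\mathcal{A},\mathcal{S}}^{\textsf{SESAME+}}(\lambda) = 1 \right] \right|
\le \textsf{negl}(\lambda)
```

Here the bound is required to hold for every probabilistic polynomial-time adversary \(\mathcal {A}\), with \(\mathcal {S}\) receiving only \(\mathcal {L} = (\mathcal {L}_{\textsf {Setup}}, \mathcal {L}_{\textsf {Token}}, \mathcal {L}_{\textsf {Search}})\).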
Remark. Due to subtle issues arising from the underlying inner product functional encryption, we prove only non-adaptive security for \(\textsf {SESAME}+\), i.e., the adversary issues all queries before running the game. Designing an adaptively secure BSSE scheme with properties similar to those of \(\textsf {SESAME}+\) seems to require fundamentally different primitives and proof techniques, which we leave as future work.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, F., Ma, J., Miao, Y., Wu, P., Song, X. (2024). Beyond Volume Pattern: Storage-Efficient Boolean Searchable Symmetric Encryption with Suppressed Leakage. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14344. Springer, Cham. https://doi.org/10.1007/978-3-031-50594-2_7
DOI: https://doi.org/10.1007/978-3-031-50594-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50593-5
Online ISBN: 978-3-031-50594-2
eBook Packages: Computer Science, Computer Science (R0)