Two-Party Decision Tree Training from Updatable Order-Revealing Encryption

Berger, Robin; Dörre, Felix; Koch, Alexander

doi:10.1007/978-3-031-54770-6_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14583)

Included in the following conference series:

International Conference on Applied Cryptography and Network Security

Abstract

Running machine learning algorithms on encrypted data is a way to reconcile the functionality needs common in industry with important privacy concerns when working with potentially sensitive data. While a variety of protocols already exist in this setting, based on fully homomorphic encryption or secure multiparty computation (MPC), we are the first to propose a protocol that makes use of a specialized Order-Revealing Encryption scheme. This scheme allows secure comparisons on ciphertexts and allows these ciphertexts to be updated into encryptions of the same plaintexts under a new key. We call this notion Updatable Order-Revealing Encryption (uORE) and provide a secure construction using a key-homomorphic pseudorandom function.
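To make the role of key-homomorphism more concrete, the following is a minimal Python sketch. It is not the uORE construction of this paper. It assumes the classic pseudorandom function \(F_k(x) = H(x)^k\) in a prime-order group (in the style of Naor, Pinkas and Reingold, listed in the references) and uses toy parameters that are far too small to be secure. The names P, Q, hash_to_group, prf, update_token and update are hypothetical and chosen for illustration only.

```python
import hashlib

# Toy parameters, NOT secure: P = 2*Q + 1 is a small safe prime (assumption for illustration).
P = 2039
Q = 1019  # order of the subgroup of squares modulo P


def hash_to_group(x: bytes) -> int:
    """Map x to an element of the order-Q subgroup by hashing and squaring."""
    h = int.from_bytes(hashlib.sha256(x).digest(), "big")
    base = 2 + h % (P - 3)   # base lies in [2, P-2], so its square is never 1
    return pow(base, 2, P)


def prf(key: int, x: bytes) -> int:
    """F_key(x) = H(x)^key mod P."""
    return pow(hash_to_group(x), key, P)


def update_token(old_key: int, new_key: int) -> int:
    """Token new_key / old_key mod Q; old_key must be nonzero mod Q (needs Python 3.8+)."""
    return (new_key * pow(old_key, -1, Q)) % Q


def update(value: int, token: int) -> int:
    """Turn F_old(x) into F_new(x) using only the token, without seeing x."""
    return pow(value, token, P)
```

For example, update(prf(k_old, x), update_token(k_old, k_new)) equals prf(k_new, x), so a value computed under one key can be moved to another key by a party that holds only the token. The same function is also additively key-homomorphic: prf(k1, x) * prf(k2, x) mod P equals prf(k1 + k2, x).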

In a second step, we use this scheme to construct an efficient three-round protocol between two parties to compute a decision tree (or forest) on labeled data provided by both parties. The protocol is secure in the passive setting and leaks some information about the data through the comparison function on the ciphertexts. We motivate how our protocol can be compiled into an actively secure protocol with less leakage using secure enclaves, with graceful degradation: if the enclave becomes fully transparent, security falls back to the uORE leakage. We also analyze the leakage of this approach, giving an upper bound on the leaked information. A performance analysis shows that our protocol is much more efficient (especially w.r.t. the number of rounds) than current MPC-based approaches, allowing for an interesting trade-off between security and performance.
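To give an intuition for what leakage from a comparison function can look like, the following Python sketch implements a simple order-revealing encryption in the style of Chenette et al. (listed in the references). It is not the uORE scheme of this paper: it is not updatable, it uses HMAC-SHA256 as a stand-in PRF, and the names NBITS, prf_mod3, ore_encrypt and ore_compare are hypothetical.

```python
import hashlib
import hmac
from typing import List, Optional, Tuple

NBITS = 8  # assumed plaintext bit length for this toy example


def prf_mod3(key: bytes, index: int, prefix: str) -> int:
    """HMAC-SHA256 as a stand-in PRF, reduced mod 3 (slightly biased; toy only)."""
    msg = f"{index}|{prefix}".encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % 3


def ore_encrypt(key: bytes, m: int) -> List[int]:
    """Encrypt bit i as PRF(key, i, bits before i) + bit_i mod 3."""
    bits = format(m, f"0{NBITS}b")
    return [(prf_mod3(key, i, bits[:i]) + int(b)) % 3 for i, b in enumerate(bits)]


def ore_compare(c1: List[int], c2: List[int]) -> Tuple[int, Optional[int]]:
    """Return (sign, index): sign is -1, 0 or 1; index is the first differing position."""
    for i, (u, v) in enumerate(zip(c1, c2)):
        if u != v:
            # Up to position i both plaintexts share a prefix, so the PRF values agree.
            return (-1 if v == (u + 1) % 3 else 1), i
    return 0, None
```

Comparing two ciphertexts needs no key and yields the order of the plaintexts. In this style of scheme it also reveals the position of the first differing bit, which is an example of information that goes beyond the order itself.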

Notes

1. If the assumption does not hold, an approximation of \(\mu \) with powers of \(2^{-l}\) for some \(l\) results in a distribution that is computationally indistinguishable from uniform.
2. https://github.com/kastel-security/ORE-Decision-Tree
3. The datasets are available on https://www.kaggle.com/.

References

Abadi, M., et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org (2015). https://www.tensorflow.org/
Abspoel, M., Escudero, D., Volgushev, N.: Secure training of decision trees with continuous attributes. Proc. Privacy Enhanc. Technol. 2021(1), 167–187 (2021). https://doi.org/10.2478/popets-2021-0010
Akavia, A., Leibovich, M., Resheff, Y.S., Ron, R., Shahar, M., Vald, M.: Privacy-preserving decision trees training and prediction. ACM Trans. Priv. Secur. 25(3), 24:1–24:30 (2022). https://doi.org/10.1145/3517197
Bentéjac, C., Csörgő, A., Martínez-Muñoz, G.: A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 54(3), 1937–1967 (2021). https://doi.org/10.1007/s10462-020-09896-5
Boneh, D., Lewi, K., Raykova, M., Sahai, A., Zhandry, M., Zimmerman, J.: Semantically secure order-revealing encryption: multi-input functional encryption without obfuscation. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015, LNCS, vol. 9057, pp. 563–594. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46803-6_19
Buitinck, L., et al.: API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pp. 108–122 (2013)
Canetti, R.: Universally Composable Security: A New Paradigm for Cryptographic Protocols. Cryptology ePrint Archive, Report 2000/067 (2000). https://eprint.iacr.org/2000/067
Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. Adv. Neural Inf. Process. Syst. 21 (2008)
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
Chenette, N., Lewi, K., Weis, S.A., Wu, D.J.: Practical order-revealing encryption with limited leakage. In: Peyrin, T. (eds) FSE 2016. LNCS, vol. 9783. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-52993-5_24
Cong, K., Das, D., Park, J., Pereira, H.V.L.: SortingHat: Efficient Private Decision Tree Evaluation via Homomorphic Encryption and Transciphering, pp. 563–577 (2022). https://doi.org/10.1145/3548606.3560702
Du, W., Zhan, Z.: Building decision tree classifier on private data (2002)
Durak, F.B., DuBuisson, T.M., Cash, D.: What Else is Revealed by Order-Revealing Encryption?, pp. 1155–1166 (2016). https://doi.org/10.1145/2976749.2978379
Frery, J., et al.: Privacy-Preserving Tree-Based Inference with Fully Homomorphic Encryption. Cryptology ePrint Archive, Report 2023/258 (2023). https://eprint.iacr.org/2023/258
Grubbs, P., Sekniqi, K., Bindschaedler, V., Naveed, M., Ristenpart, T.: Leakage-abuse attacks against order-revealing encryption. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 655–672 (2017). https://doi.org/10.1109/SP.2017.44
Hamada, K., Ikarashi, D., Kikuchi, R., Chida, K.: Efficient decision tree training with new data structure for secure multi-party computation. Proc. Privacy Enhanc. Technol. 2023(1), 343–364 (2023). https://doi.org/10.56553/popets-2023-0021
de Hoogh, S., Schoenmakers, B., Chen, P., op den Akker, H.: Practical secure decision tree learning in a teletreatment application. In: Christin, N., Safavi-Naini, R. (eds.) FC 2014, LNCS, vol. 8437, pp. 179–194. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45472-5_12
Jurado, M., Palamidessi, C., Smith, G.: A Formal Information-Theoretic Leakage Analysis of Order-Revealing Encryption, pp. 1–16 (2021). https://doi.org/10.1109/CSF51468.2021.00046
Keller, M.: MP-SPDZ: A Versatile Framework for Multi-Party Computation, pp. 1575–1590 (2020). https://doi.org/10.1145/3372297.3417872
Knott, B., Venkataraman, S., Hannun, A., Sengupta, S., Ibrahim, M., van der Maaten, L.: CrypTen: secure multi-party computation meets machine learning. Adv. Neural. Inf. Process. Syst. 34, 4961–4973 (2021)
Kubat, M.: An Introduction to Machine Learning. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63913-0
Lee, J.-W., et al.: Privacy-preserving machine learning with fully homomorphic encryption for deep neural network. IEEE Access 10, 30039–30054 (2022)
Lewi, K., Wu, D.J.: Order-Revealing Encryption: New Constructions, Applications, and Lower Bounds, pp. 1167–1178 (2016). https://doi.org/10.1145/2976749.2978376
Li, Y., Wang, H., Zhao, Y.: Delegatable Order-Revealing Encryption, pp. 134–147 (2019). https://doi.org/10.1145/3321705.3329829
Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (eds.) CRYPTO 2000. LNCS, vol. 1880. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44598-6_3
Liu, X., Deng, R.H., Choo, K.-K.R., Weng, J.: An efficient privacy-preserving outsourced calculation toolkit with multiple keys. IEEE Trans. Inf. Forens. Secur. 11(11), 2401–2414 (2016). https://doi.org/10.1109/TIFS.2016.2573770
Lv, C., Wang, J., Sun, S.-F., Wang, Y., Qi, S., Chen, X.: Towards practical multi-client order-revealing encryption: improvement and application. In: IEEE Transactions on Dependable and Secure Computing (2023)
Naor, M., Pinkas, B., Reingold, O.: Distributed Pseudo-random Functions and KDCs, pp. 327–346 (1999). https://doi.org/10.1007/3-540-48910-X_23
Ohrimenko, O., et al.: Oblivious Multi-party Machine Learning on Trusted Processors, pp. 619–636 (2016)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Elsevier (2014)
Tangirala, S.: Evaluating the impact of GINI index and information gain on classification using decision tree classifier algorithm. Int. J. Adv. Comput. Sci. Appl. 11(2), 612–619 (2020)
Vaidya, J., Clifton, C., Kantarcioglu, M., Patterson, A.S.: Privacy-preserving decision trees over vertically partitioned data. ACM Trans. Knowl. Discov. Data 2(3), 14:1–14:27 (2008). https://doi.org/10.1145/1409620.1409624

Acknowledgements

We thank the anonymous reviewers for their helpful and constructive feedback. This work was supported by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF) and by KASTEL Security Research Labs. Robin Berger: This work was supported by funding from SAP Security Research. Felix Dörre: This work was supported by funding by the German Federal Ministry of Education and Research (BMBF) under the project “VE-ASCOT” (ID 16ME0275). Alexander Koch: This work was supported by the France 2030 ANR Project ANR-22-PECY-003 SecureCompute.

Author information

Authors and Affiliations

KASTEL Security Research Labs, Karlsruhe Institute of Technology, Karlsruhe, Germany
Robin Berger & Felix Dörre
CNRS and IRIF, Université Paris Cité, Paris, France
Alexander Koch

Corresponding author

Correspondence to Robin Berger.

Editor information

Editors and Affiliations

New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
Christina Pöpper
Radboud University Nijmegen, Nijmegen, The Netherlands
Lejla Batina

Appendix

A. A Brief Introduction to the UC Framework

In the following, we give a brief introduction to the Universal Composability (UC) framework by Canetti [7], tailored to our use case. As the framework is quite complex, we omit any details that are not relevant for our work.

The UC model extends the real-ideal paradigm, in which the security of a protocol is defined through an ideal functionality that captures the computation to be done and is secure by definition.

All parties are modeled as interactive probabilistic polynomial-time (PPT) machines. In addition to the parties of the protocol, a UC execution involves two further entities, the environment and the adversary, which are modeled in the same way.

The adversary can corrupt any subset of parties. Under passive security, the adversary can see the view of the parties it corrupts (including all internal state, randomness, and incoming and outgoing messages), but it cannot make corrupted parties deviate from the protocol. If it accesses variables from the internal state of a corrupted party, we say it extracts this information.

The environment selects inputs for honest parties and receives their outputs. Additionally, it can freely interact with the adversary, sending and receiving arbitrary messages.

To prove the security of a protocol, UC uses the notion of protocol emulation. We say that a protocol \(\pi \) in the real world securely realizes an ideal functionality \(\mathcal {F}\) in the ideal world if, for all adversaries \(\mathcal {A}\), there exists a simulator \(\mathcal {S}\) such that no environment can distinguish between an interaction with \(\pi \) and \(\mathcal {A}\) in the real world and an interaction with \(\mathcal {F}\) and \(\mathcal {S}\) in the ideal world. This can be shown by constructing a simulator for each \(\mathcal {A}\) that internally runs \(\mathcal {A}\) and translates between ideal messages from/to the ideal functionality and protocol messages from/to corrupted parties. Additionally, it is sufficient to consider only the dummy adversary, which forwards all protocol messages it receives to the environment and sends any messages it receives from the environment as protocol messages. In the real world, the honest parties execute the protocol, and the environment can interact with them through the real adversary. In the ideal world, the inputs of honest parties are sent directly to the ideal functionality, and the outputs of the ideal functionality to the honest parties are output directly by them.
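The emulation condition can be written compactly as follows. The symbol \(\mathcal {Z}\) for the environment, the random variable \(\mathrm {EXEC}\) for the environment's output, and \(\approx \) for computational indistinguishability are assumptions following common UC notation; they are not taken from the text above.

```latex
\[
  \pi \text{ securely realizes } \mathcal{F}
  \;\iff\;
  \forall \mathcal{A}\ \exists \mathcal{S}\ \forall \mathcal{Z}:\quad
  \mathrm{EXEC}_{\pi, \mathcal{A}, \mathcal{Z}}
  \;\approx\;
  \mathrm{EXEC}_{\mathcal{F}, \mathcal{S}, \mathcal{Z}}
\]
```

Here the left-hand side describes the real world, where \(\mathcal {Z}\) interacts with \(\pi \) and \(\mathcal {A}\), and the right-hand side describes the ideal world, where \(\mathcal {Z}\) interacts with \(\mathcal {F}\) and \(\mathcal {S}\).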

If a protocol \(\pi \) is proven to realize an ideal functionality \(\mathcal {F}\), all security properties of \(\mathcal {F}\) carry over to \(\pi \), since any property violated by \(\pi \) could otherwise be used to distinguish the real execution from the ideal one.

In UC, the universal composition theorem says that if a protocol is proven to realize an ideal functionality, it remains secure under universal composition. Therefore, it can, for example, be run in parallel, concurrently, or as a subprotocol of other protocols without becoming insecure. If a protocol \(\pi '\) realizes a functionality \(\mathcal {F}'\) using \(\mathcal {F}\) as a building block, we say \(\pi '\) realizes \(\mathcal {F}'\) in the \(\mathcal {F}\)-hybrid model. Due to the universal composition theorem, \(\pi '\) still realizes \(\mathcal {F}'\), even after \(\mathcal {F}\) is instantiated with a protocol that securely realizes \(\mathcal {F}\).
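Stated schematically, the composition step reads as follows. The protocol name \(\rho \) and the substitution notation \((\pi ')^{\rho /\mathcal {F}}\), meaning \(\pi '\) with every call to \(\mathcal {F}\) replaced by a call to \(\rho \), are assumptions in the style of common UC notation and do not appear in the text above.

```latex
\[
  \rho \text{ securely realizes } \mathcal{F}
  \;\;\wedge\;\;
  \pi' \text{ realizes } \mathcal{F}' \text{ in the } \mathcal{F}\text{-hybrid model}
  \;\;\Longrightarrow\;\;
  (\pi')^{\rho/\mathcal{F}} \text{ realizes } \mathcal{F}'
\]
```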

Cite this paper

Berger, R., Dörre, F., Koch, A. (2024). Two-Party Decision Tree Training from Updatable Order-Revealing Encryption. In: Pöpper, C., Batina, L. (eds) Applied Cryptography and Network Security. ACNS 2024. Lecture Notes in Computer Science, vol 14583. Springer, Cham. https://doi.org/10.1007/978-3-031-54770-6_12

DOI: https://doi.org/10.1007/978-3-031-54770-6_12
Published: 01 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54769-0
Online ISBN: 978-3-031-54770-6
eBook Packages: Computer Science, Computer Science (R0)
