Abstract
Running machine learning algorithms on encrypted data reconciles the functionality needs common in industry with the important privacy concerns that arise when working with potentially sensitive data. While there is already a variety of protocols in this setting based on fully homomorphic encryption or secure multiparty computation (MPC), we are the first to propose a protocol that makes use of a specialized Order-Revealing Encryption scheme. This scheme allows secure comparisons on ciphertexts and allows these ciphertexts to be updated into encryptions of the same plaintexts under a new key. We call this notion Updatable Order-Revealing Encryption (uORE) and provide a secure construction using a key-homomorphic pseudorandom function.
In a second step, we use this scheme to construct an efficient three-round protocol between two parties to compute a decision tree (or forest) on labeled data provided by both parties. The protocol is in the passively secure setting and has some leakage on the data that arises from the comparison function on the ciphertexts. We motivate how our protocol can be compiled into an actively secure protocol with less leakage using secure enclaves. This compilation degrades gracefully: for example, if the enclave becomes fully transparent, security falls back to the uORE leakage. We also analyze the leakage of this approach, giving an upper bound on the leaked information. Analyzing the performance of our protocol shows that this approach allows us to be much more efficient (especially w.r.t. the number of rounds) than current MPC-based approaches, hence allowing for an interesting trade-off between security and performance.
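To illustrate what "key-homomorphic" and "updatable" mean here, the following is a minimal sketch of a textbook key-homomorphic PRF of the form F_k(x) = H(x)^k in a prime-order group. It is not the uORE construction of this paper: the group parameters, the hash-to-group mapping, and the function names prf, update_token and update are assumptions chosen for illustration, and the parameters are far too small to offer any security.

```python
import hashlib

# Toy parameters (assumption, insecure): P = 2*Q + 1 with P and Q prime.
P = 1019
Q = 509


def hash_to_group(x: bytes) -> int:
    """Map x to an element of order Q in Z_P^* (stands in for a random oracle)."""
    digest = int.from_bytes(hashlib.sha256(x).digest(), "big")
    h = 2 + digest % (P - 3)      # h lies in [2, P-2], so h != +-1 mod P
    return pow(h, 2, P)           # squaring lands in the order-Q subgroup


def prf(key: int, x: bytes) -> int:
    """Key-homomorphic PRF sketch: F_k(x) = H(x)^k mod P."""
    return pow(hash_to_group(x), key, P)


def update_token(old_key: int, new_key: int) -> int:
    """Token that moves PRF values from old_key to new_key."""
    return (new_key * pow(old_key, -1, Q)) % Q


def update(value: int, token: int) -> int:
    """Re-key a PRF value without knowing the input x or either key."""
    return pow(value, token, P)


k1, k2 = 123, 456
x = b"attribute value 42"

# Key homomorphism: F_{k1+k2}(x) = F_{k1}(x) * F_{k2}(x).
assert prf((k1 + k2) % Q, x) == (prf(k1, x) * prf(k2, x)) % P

# Update: a value under k1 becomes the value under k2 for the same input.
assert update(prf(k1, x), update_token(k1, k2)) == prf(k2, x)
```

The party applying the update sees only the token and the old PRF value, which is the property that lets ciphertexts be moved to a new key while still encrypting the same plaintext.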
Notes
- 1. If the assumption does not hold, an approximation of \(\mu \) with powers of \(2^{-l}\) for some \(l\) results in a distribution that is computationally indistinguishable from uniform.
- 2.
- 3. The datasets are available on https://www.kaggle.com/.
References
Abadi, M., et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org (2015). https://www.tensorflow.org/
Abspoel, M., Escudero, D., Volgushev, N.: Secure training of decision trees with continuous attributes. Proc. Privacy Enhanc. Technol. 2021(1), 167–187 (2021). https://doi.org/10.2478/popets-2021-0010
Akavia, A., Leibovich, M., Resheff, Y.S., Ron, R., Shahar, M., Vald, M.: Privacy-preserving decision trees training and prediction. ACM Trans. Priv. Secur. 25(3), 24:1–24:30 (2022). https://doi.org/10.1145/3517197
Bentéjac, C., Csörgő, A., Martínez-Muñoz, G.: A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 54(3), 1937–1967 (2021). https://doi.org/10.1007/s10462-020-09896-5
Boneh, D., Lewi, K., Raykova, M., Sahai, A., Zhandry, M., Zimmerman, J.: Semantically secure order-revealing encryption: multi-input functional encryption without obfuscation. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015, LNCS, vol. 9057, pp. 563–594. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46803-6_19
Buitinck, L., et al.: API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pp. 108–122 (2013)
Canetti, R.: Universally Composable Security: A New Paradigm for Cryptographic Protocols. Cryptology ePrint Archive, Report 2000/067 (2000). https://eprint.iacr.org/2000/067
Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. Adv. Neural Inf. Process. Syst. 21 (2008)
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
Chenette, N., Lewi, K., Weis, S.A., Wu, D.J.: Practical order-revealing encryption with limited leakage. In: Peyrin, T. (ed.) FSE 2016. LNCS, vol. 9783. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-52993-5_24
Cong, K., Das, D., Park, J., Pereira, H.V.L.: SortingHat: Efficient Private Decision Tree Evaluation via Homomorphic Encryption and Transciphering, pp. 563–577 (2022). https://doi.org/10.1145/3548606.3560702
Du, W., Zhan, Z.: Building decision tree classifier on private data (2002)
Durak, F.B., DuBuisson, T.M., Cash, D.: What Else is Revealed by Order-Revealing Encryption?, pp. 1155–1166 (2016). https://doi.org/10.1145/2976749.2978379
Frery, J., et al.: Privacy-Preserving Tree-Based Inference with Fully Homomorphic Encryption. Cryptology ePrint Archive, Report 2023/258 (2023). https://eprint.iacr.org/2023/258
Grubbs, P., Sekniqi, K., Bindschaedler, V., Naveed, M., Ristenpart, T.: Leakage-abuse attacks against order-revealing encryption. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 655–672 (2017). https://doi.org/10.1109/SP.2017.44
Hamada, K., Ikarashi, D., Kikuchi, R., Chida, K.: Efficient decision tree training with new data structure for secure multi-party computation. Proc. Privacy Enhanc. Technol. 2023(1), 343–364 (2023). https://doi.org/10.56553/popets-2023-0021
de Hoogh, S., Schoenmakers, B., Chen, P., op den Akker, H.: Practical secure decision tree learning in a teletreatment application. In: Christin, N., Safavi-Naini, R. (eds.) FC 2014, LNCS, vol. 8437, pp. 179–194. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45472-5_12
Jurado, M., Palamidessi, C., Smith, G.: A Formal Information-Theoretic Leakage Analysis of Order-Revealing Encryption, pp. 1–16 (2021). https://doi.org/10.1109/CSF51468.2021.00046
Keller, M.: MP-SPDZ: A Versatile Framework for Multi-Party Computation, pp. 1575–1590 (2020). https://doi.org/10.1145/3372297.3417872
Knott, B., Venkataraman, S., Hannun, A., Sengupta, S., Ibrahim, M., van der Maaten, L.: CrypTen: secure multi-party computation meets machine learning. Adv. Neural. Inf. Process. Syst. 34, 4961–4973 (2021)
Kubat, M.: An Introduction to Machine Learning. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63913-0
Lee, J.-W., et al.: Privacy-preserving machine learning with fully homomorphic encryption for deep neural network. IEEE Access 10, 30039–30054 (2022)
Lewi, K., Wu, D.J.: Order-Revealing Encryption: New Constructions, Applications, and Lower Bounds, pp. 1167–1178 (2016). https://doi.org/10.1145/2976749.2978376
Li, Y., Wang, H., Zhao, Y.: Delegatable Order-Revealing Encryption, pp. 134–147 (2019). https://doi.org/10.1145/3321705.3329829
Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44598-6_3
Liu, X., Deng, R.H., Choo, K.-K.R., Weng, J.: An efficient privacy-preserving outsourced calculation toolkit with multiple keys. IEEE Trans. Inf. Forens. Secur. 11(11), 2401–2414 (2016). https://doi.org/10.1109/TIFS.2016.2573770
Lv, C., Wang, J., Sun, S.-F., Wang, Y., Qi, S., Chen, X.: Towards practical multi-client order-revealing encryption: improvement and application. In: IEEE Transactions on Dependable and Secure Computing (2023)
Naor, M., Pinkas, B., Reingold, O.: Distributed Pseudo-random Functions and KDCs, pp. 327–346 (1999). https://doi.org/10.1007/3-540-48910-X_23
Ohrimenko, O., et al.: Oblivious Multi-party Machine Learning on Trusted Processors, pp. 619–636 (2016)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Elsevier (2014)
Tangirala, S.: Evaluating the impact of GINI index and information gain on classification using decision tree classifier algorithm. Int. J. Adv. Comput. Sci. Appl. 11(2), 612–619 (2020)
Vaidya, J., Clifton, C., Kantarcioglu, M., Scott Patterson, A.: Privacy-preserving decision trees over vertically partitioned data. ACM Trans. Knowl. Discov. Data 2(3), 14:1–14:27 (2008). https://doi.org/10.1145/1409620.1409624
Acknowledgements
We thank the anonymous reviewers for their helpful and constructive feedback. This work was supported by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF) and by KASTEL Security Research Labs. Robin Berger: This work was supported by funding from SAP Security Research. Felix Dörre: This work was supported by funding from the German Federal Ministry of Education and Research (BMBF) under the project “VE-ASCOT” (ID 16ME0275). Alexander Koch: This work was supported by the France 2030 ANR Project ANR-22-PECY-003 SecureCompute.
Appendix
A A Brief Introduction to the UC Framework
In the following, we give a brief introduction to the Universal Composability (UC) framework by Canetti [7], tailored to our use case. As the framework is quite complex, we omit any details that are not relevant for our work.
The UC model extends the real-ideal paradigm, in which the security of a protocol is defined through an ideal functionality that captures the computation to be done and is secure by definition.
All parties are modeled as interactive probabilistic polynomial-time (PPT) machines. In addition to the parties of the protocol, a UC execution involves two further entities, the environment and the adversary, which are modeled in the same way.
The adversary can corrupt any subset of parties. In the passive-security setting, the adversary sees the view of the parties it corrupts (including all internal state, randomness, and incoming and outgoing messages), but it cannot make corrupted parties deviate from the protocol. If it accesses variables from the internal state of a corrupted party, we say it extracts this information.
The environment selects inputs for honest parties and receives their outputs. Additionally, it can freely interact with the adversary, sending and receiving arbitrary messages.
To prove the security of a protocol, UC uses the notion of protocol emulation. We say a protocol \(\pi \) in the real world securely realizes an ideal functionality \(\mathcal {F}\) in the ideal world if, for all adversaries \(\mathcal {A}\), there exists a simulator \(\mathcal {S}\) such that no environment can distinguish between an interaction with \(\pi \) and \(\mathcal {A}\) in the real world and an interaction with \(\mathcal {F}\) and \(\mathcal {S}\) in the ideal world. This can be shown by constructing, for each \(\mathcal {A}\), a simulator that internally runs \(\mathcal {A}\) and translates between ideal messages from/to the ideal functionality and protocol messages from/to corrupted parties. It is sufficient to consider only the dummy adversary, which forwards all protocol messages it receives to the environment and sends any messages it receives from the environment as protocol messages. In the real world, the honest parties execute the protocol and the environment can interact with them through the real adversary. In the ideal world, the inputs of honest parties are sent directly to the ideal functionality, and the honest parties directly output whatever the ideal functionality returns to them.
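The emulation condition above can be written compactly. The notation \(\textrm{EXEC}\) is an assumption borrowed from the usual presentation of UC and is not defined elsewhere in this appendix: \(\textrm{EXEC}_{\pi ,\mathcal {A},\mathcal {Z}}\) denotes the output of environment \(\mathcal {Z}\) after interacting with \(\pi \) and \(\mathcal {A}\), and \(\approx \) denotes computational indistinguishability.
\[ \pi \text{ securely realizes } \mathcal {F} \quad \Longleftrightarrow \quad \forall \mathcal {A}\ \exists \mathcal {S}\ \forall \mathcal {Z}:\ \textrm{EXEC}_{\pi ,\mathcal {A},\mathcal {Z}} \approx \textrm{EXEC}_{\mathcal {F},\mathcal {S},\mathcal {Z}} \]
Restricting to the dummy adversary \(\mathcal {D}\) removes the outer quantifier, so it suffices to show
\[ \exists \mathcal {S}\ \forall \mathcal {Z}:\ \textrm{EXEC}_{\pi ,\mathcal {D},\mathcal {Z}} \approx \textrm{EXEC}_{\mathcal {F},\mathcal {S},\mathcal {Z}}. \]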
If a protocol \(\pi \) is proven to realize an ideal functionality \(\mathcal {F}\), all security properties of \(\mathcal {F}\) carry over to \(\pi \): a property that held for \(\mathcal {F}\) but not for \(\pi \) could otherwise be used to distinguish the real and ideal executions.
In UC, the universal composition theorem states that if a protocol is proven to realize an ideal functionality, it remains secure under universal composition. Therefore, it can, for example, be run in parallel, concurrently, or as a subprotocol of other protocols without becoming insecure. If a protocol \(\pi '\) realizes a functionality \(\mathcal {F}'\) using \(\mathcal {F}\) as a building block, we say \(\pi '\) realizes \(\mathcal {F}'\) in the \(\mathcal {F}\)-hybrid model. Due to the universal composition theorem, \(\pi '\) still realizes \(\mathcal {F}'\), even after \(\mathcal {F}\) is instantiated with a protocol that securely realizes \(\mathcal {F}\).
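Stated as an implication, the last step reads as follows. The protocol name \(\rho \) and the substitution notation \(\pi '^{\rho /\mathcal {F}}\) (the protocol \(\pi '\) with every call to \(\mathcal {F}\) replaced by an execution of \(\rho \)) are introduced here only for illustration.
\[ \big (\pi ' \text{ realizes } \mathcal {F}' \text{ in the } \mathcal {F}\text{-hybrid model}\big ) \ \wedge \ \big (\rho \text{ realizes } \mathcal {F}\big ) \ \Longrightarrow \ \pi '^{\rho /\mathcal {F}} \text{ realizes } \mathcal {F}' \]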
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Berger, R., Dörre, F., Koch, A. (2024). Two-Party Decision Tree Training from Updatable Order-Revealing Encryption. In: Pöpper, C., Batina, L. (eds) Applied Cryptography and Network Security. ACNS 2024. Lecture Notes in Computer Science, vol 14583. Springer, Cham. https://doi.org/10.1007/978-3-031-54770-6_12
DOI: https://doi.org/10.1007/978-3-031-54770-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54769-0
Online ISBN: 978-3-031-54770-6
eBook Packages: Computer Science, Computer Science (R0)