Two-Party Decision Tree Training from Updatable Order-Revealing Encryption

  • Conference paper
Applied Cryptography and Network Security (ACNS 2024)

Abstract

Running machine learning algorithms on encrypted data is a way forward to marry functionality needs common in industry with the important concerns for privacy when working with potentially sensitive data. While there is already a variety of protocols in this setting based on fully homomorphic encryption or secure multiparty computation (MPC), we are the first to propose a protocol that makes use of a specialized Order-Revealing Encryption scheme. This scheme allows secure comparisons on ciphertexts and allows these ciphertexts to be updated into encryptions of the same plaintexts under a new key. We call this notion Updatable Order-Revealing Encryption (uORE) and provide a secure construction using a key-homomorphic pseudorandom function.

In a second step, we use this scheme to construct an efficient three-round protocol between two parties to compute a decision tree (or forest) on labeled data provided by both parties. The protocol is passively secure and has some leakage on the data that arises from the comparison function on the ciphertexts. We motivate how our protocol can be compiled into an actively secure protocol with less leakage using secure enclaves, with graceful degradation, e.g. falling back to the uORE leakage if the enclave becomes fully transparent. We also analyze the leakage of this approach, giving an upper bound on the leaked information. Analyzing the performance of our protocol shows that this approach allows us to be much more efficient (especially w.r.t. the number of rounds) than current MPC-based approaches, hence allowing for an interesting trade-off between security and performance.
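To illustrate what "updating ciphertexts under a new key" can look like at the level of a key-homomorphic pseudorandom function, the following is a minimal sketch based on the classical function \(F(k, x) = H(x)^k\) of Naor, Pinkas and Reingold [28]. It is not the uORE construction of the paper. The group parameters `P` and `Q`, the helper names, and the use of SHA-256 as a stand-in for a random oracle are all assumptions made for illustration, and the parameters are far too small to be secure.

```python
import hashlib
import secrets

# Toy parameters (assumption, insecurely small): safe prime P = 2*Q + 1.
P = 2039
Q = 1019


def hash_to_group(x: bytes) -> int:
    """Map x into the subgroup of squares mod P (stand-in for a random oracle)."""
    digest = int.from_bytes(hashlib.sha256(x).digest(), "big")
    return pow(digest % P, 2, P)


def keygen() -> int:
    """Sample a key uniformly from [1, Q-1]."""
    return 1 + secrets.randbelow(Q - 1)


def prf(key: int, x: bytes) -> int:
    """Evaluate F(key, x) = H(x)^key mod P."""
    return pow(hash_to_group(x), key, P)


def update_token(old_key: int, new_key: int) -> int:
    """Token that moves PRF values from old_key to new_key (hypothetical helper)."""
    return (new_key * pow(old_key, -1, Q)) % Q


def update(value: int, token: int) -> int:
    """Re-key a PRF value without knowing the input x or either key."""
    return pow(value, token, P)


if __name__ == "__main__":
    k_old, k_new = keygen(), keygen()
    token = update_token(k_old, k_new)
    x = b"attribute value 42"
    # The updated value equals a fresh evaluation under the new key.
    print(update(prf(k_old, x), token) == prf(k_new, x))
```

The party holding only `token` can re-key values without learning the inputs, which is the property the abstract relies on. How this combines with order-revealing comparison is specific to the paper and not shown here.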

Notes

  1. If the assumption does not hold, an approximation of \(\mu \) with powers of \(2^{-l}\) for some l results in a distribution that is computationally indistinguishable from uniform.

  2. https://github.com/kastel-security/ORE-Decision-Tree.

  3. The datasets are available on https://www.kaggle.com/.

References

  1. Abadi, M., et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org (2015). https://www.tensorflow.org/

  2. Abspoel, M., Escudero, D., Volgushev, N.: Secure training of decision trees with continuous attributes. Proc. Privacy Enhanc. Technol. 2021(1), 167–187 (2021). https://doi.org/10.2478/popets-2021-0010

  3. Akavia, A., Leibovich, M., Resheff, Y.S., Ron, R., Shahar, M., Vald, M.: Privacy-preserving decision trees training and prediction. ACM Trans. Priv. Secur. 25(3), 24:1–24:30 (2022). https://doi.org/10.1145/3517197

  4. Bentéjac, C., Csörgő, A., Martínez-Muñoz, G.: A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 54(3), 1937–1967 (2021). https://doi.org/10.1007/s10462-020-09896-5

  5. Boneh, D., Lewi, K., Raykova, M., Sahai, A., Zhandry, M., Zimmerman, J.: Semantically secure order-revealing encryption: multi-input functional encryption without obfuscation. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015, LNCS, vol. 9057, pp. 563–594. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46803-6_19

  6. Buitinck, L., et al.: API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pp. 108–122 (2013)

  7. Canetti, R.: Universally Composable Security: A New Paradigm for Cryptographic Protocols. Cryptology ePrint Archive, Report 2000/067 (2000). https://eprint.iacr.org/2000/067

  8. Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. Adv. Neural Inf. Process. Syst. 21 (2008)

  9. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)

  10. Chenette, N., Lewi, K., Weis, S.A., Wu, D.J.: Practical order-revealing encryption with limited leakage. In: Peyrin, T. (ed.) FSE 2016. LNCS, vol. 9783. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-52993-5_24

  11. Cong, K., Das, D., Park, J., Pereira, H.V.L.: SortingHat: Efficient Private Decision Tree Evaluation via Homomorphic Encryption and Transciphering, pp. 563–577 (2022). https://doi.org/10.1145/3548606.3560702

  12. Du, W., Zhan, Z.: Building decision tree classifier on private data (2002)

  13. Durak, F.B., DuBuisson, T.M., Cash, D.: What Else is Revealed by Order-Revealing Encryption?, pp. 1155–1166 (2016). https://doi.org/10.1145/2976749.2978379

  14. Frery, J., et al.: Privacy-Preserving Tree-Based Inference with Fully Homomorphic Encryption. Cryptology ePrint Archive, Report 2023/258 (2023). https://eprint.iacr.org/2023/258

  15. Grubbs, P., Sekniqi, K., Bindschaedler, V., Naveed, M., Ristenpart, T.: Leakage-abuse attacks against order-revealing encryption. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 655–672 (2017). https://doi.org/10.1109/SP.2017.44

  16. Hamada, K., Ikarashi, D., Kikuchi, R., Chida, K.: Efficient decision tree training with new data structure for secure multi-party computation. Proc. Privacy Enhanc. Technol. 2023(1), 343–364 (2023). https://doi.org/10.56553/popets-2023-0021

  17. de Hoogh, S., Schoenmakers, B., Chen, P., op den Akker, H.: Practical secure decision tree learning in a teletreatment application. In: Christin, N., Safavi-Naini, R. (eds.) FC 2014, LNCS, vol. 8437, pp. 179–194. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45472-5_12

  18. Jurado, M., Palamidessi, C., Smith, G.: A Formal Information-Theoretic Leakage Analysis of Order-Revealing Encryption, pp. 1–16 (2021). https://doi.org/10.1109/CSF51468.2021.00046

  19. Keller, M.: MP-SPDZ: A Versatile Framework for Multi-Party Computation, pp. 1575–1590 (2020). https://doi.org/10.1145/3372297.3417872

  20. Knott, B., Venkataraman, S., Hannun, A., Sengupta, S., Ibrahim, M., van der Maaten, L.: CrypTen: secure multi-party computation meets machine learning. Adv. Neural. Inf. Process. Syst. 34, 4961–4973 (2021)

  21. Kubat, M.: An Introduction to Machine Learning. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63913-0

  22. Lee, J.-W., et al.: Privacy-preserving machine learning with fully homomorphic encryption for deep neural network. IEEE Access 10, 30039–30054 (2022)

  23. Lewi, K., Wu, D.J.: Order-Revealing Encryption: New Constructions, Applications, and Lower Bounds, pp. 1167–1178 (2016). https://doi.org/10.1145/2976749.2978376

  24. Li, Y., Wang, H., Zhao, Y.: Delegatable Order-Revealing Encryption, pp. 134–147 (2019). https://doi.org/10.1145/3321705.3329829

  25. Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44598-6_3

  26. Liu, X., Deng, R.H., Choo, K.-K.R., Weng, J.: An efficient privacy-preserving outsourced calculation toolkit with multiple keys. IEEE Trans. Inf. Forens. Secur. 11(11), 2401–2414 (2016). https://doi.org/10.1109/TIFS.2016.2573770

  27. Lv, C., Wang, J., Sun, S.-F., Wang, Y., Qi, S., Chen, X.: Towards practical multi-client order-revealing encryption: improvement and application. In: IEEE Transactions on Dependable and Secure Computing (2023)

  28. Naor, M., Pinkas, B., Reingold, O.: Distributed Pseudo-random Functions and KDCs, pp. 327–346 (1999). https://doi.org/10.1007/3-540-48910-X_23

  29. Ohrimenko, O., et al.: Oblivious Multi-party Machine Learning on Trusted Processors, pp. 619–636 (2016)

  30. Quinlan, J.R.: C4.5: Programs for Machine Learning. Elsevier (2014)

  31. Tangirala, S.: Evaluating the impact of GINI index and information gain on classification using decision tree classifier algorithm. Int. J. Adv. Comput. Sci. Appl. 11(2), 612–619 (2020)

  32. Vaidya, J., Clifton, C., Kantarcioglu, M., Patterson, A.S.: Privacy-preserving decision trees over vertically partitioned data. ACM Trans. Knowl. Discov. Data 2(3), 14:1–14:27 (2008). https://doi.org/10.1145/1409620.1409624

Acknowledgements

We thank the anonymous reviewers for their helpful and constructive feedback. This work was supported by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF) and by KASTEL Security Research Labs. Robin Berger: This work was supported by funding from SAP Security Research. Felix Dörre: This work was supported by funding by the German Federal Ministry of Education and Research (BMBF) under the project “VE-ASCOT” (ID 16ME0275). Alexander Koch: This work was supported by the France 2030 ANR Project ANR-22-PECY-003 SecureCompute.

Author information

Corresponding author

Correspondence to Robin Berger.

Appendix

A A Brief Introduction to the UC Framework

In the following, we give a brief introduction to the Universal Composability (UC) framework by Canetti [7], tailored to our use case. As the framework is quite complex, we omit any details that are not relevant for our work.

The UC model extends the notion of the real-ideal paradigm, where the security of a protocol is defined through an ideal functionality that captures the computation to be done and is secure by definition.

All parties are modeled as interactive probabilistic polynomial-time (PPT) machines. In addition to the parties of the protocol, UC execution is defined with two additional entities, namely the environment and the adversary, which are modeled in the same way.

The adversary can corrupt any subset of parties. Considering passive security, the adversary can see the view of parties it corrupts (including all internal state, randomness, incoming and outgoing messages), but it cannot make corrupted parties deviate from the protocol. If it accesses variables from the internal state of a corrupted party, we say it extracts this information.

The environment selects inputs for honest parties and receives their outputs. Additionally, it can freely interact with the adversary, sending and receiving arbitrary messages.

To prove the security of a protocol, UC uses the notion of protocol emulation. We say a protocol \(\pi \) in the real world securely realizes an ideal functionality \(\mathcal {F}\) in the ideal world if, for all adversaries \(\mathcal {A}\), there exists a simulator \(\mathcal {S}\) such that no environment can distinguish between an interaction with \(\pi \) and \(\mathcal {A}\) in the real world and an interaction with \(\mathcal {F}\) and \(\mathcal {S}\) in the ideal world. This can be shown by constructing, for each \(\mathcal {A}\), a simulator that internally runs \(\mathcal {A}\) and translates between ideal messages from/to the ideal functionality and protocol messages from/to corrupted parties. Moreover, it suffices to consider only the dummy adversary, which forwards all protocol messages it receives to the environment and sends any messages it receives from the environment as protocol messages. In the real world, the honest parties execute the protocol and the environment can interact with them through the real adversary. In the ideal world, the inputs of honest parties are sent directly to the ideal functionality, and the outputs that the ideal functionality sends to the honest parties are output by them directly.
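The emulation condition described above can be written compactly as follows. The symbols \(\mathcal {Z}\) for the environment, \(\mathcal {D}\) for the dummy adversary, \(\textrm{EXEC}\) for the environment's output distribution, and \(\approx \) for computational indistinguishability are notation assumed here for illustration and do not appear in the text above.

\[
\forall \mathcal {A}\;\, \exists \mathcal {S}\;\, \forall \mathcal {Z}: \quad \textrm{EXEC}_{\pi , \mathcal {A}, \mathcal {Z}} \;\approx \; \textrm{EXEC}_{\mathcal {F}, \mathcal {S}, \mathcal {Z}}
\]

Restricting to the dummy adversary gives the equivalent condition

\[
\exists \mathcal {S}\;\, \forall \mathcal {Z}: \quad \textrm{EXEC}_{\pi , \mathcal {D}, \mathcal {Z}} \;\approx \; \textrm{EXEC}_{\mathcal {F}, \mathcal {S}, \mathcal {Z}}.
\]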

If a protocol \(\pi \) is proven to realize an ideal functionality \(\mathcal {F}\), all security properties of \(\mathcal {F}\) carry over to \(\pi \): any property that holds for \(\mathcal {F}\) but fails for \(\pi \) could otherwise be used to distinguish the real and ideal executions.

In UC, the universal composition theorem says that if a protocol is proven to realize an ideal functionality, it remains secure under universal composition. Therefore, it can for example be run in parallel, concurrently or as a subprotocol to other protocols without becoming insecure. If a protocol \(\pi '\) realizes a functionality \(\mathcal {F}'\) using \(\mathcal {F}\) as a building block, we say \(\pi '\) realizes \(\mathcal {F}'\) in the \(\mathcal {F}\)-hybrid model. Due to the universal composition theorem, \(\pi '\) still realizes \(\mathcal {F}'\), even after \(\mathcal {F}\) is instantiated with a protocol that securely realizes \(\mathcal {F}\).
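Stated as an implication, the use of the composition theorem described above reads as follows. The notation \({\pi '}^{\mathcal {F} \rightarrow \pi }\), for \(\pi '\) with every call to \(\mathcal {F}\) replaced by an execution of \(\pi \), is an assumption introduced here for illustration.

\[
\bigl (\pi \text { realizes } \mathcal {F}\bigr ) \;\wedge \; \bigl (\pi ' \text { realizes } \mathcal {F}' \text { in the } \mathcal {F}\text {-hybrid model}\bigr ) \;\Longrightarrow \; {\pi '}^{\mathcal {F} \rightarrow \pi } \text { realizes } \mathcal {F}'
\]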

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Berger, R., Dörre, F., Koch, A. (2024). Two-Party Decision Tree Training from Updatable Order-Revealing Encryption. In: Pöpper, C., Batina, L. (eds) Applied Cryptography and Network Security. ACNS 2024. Lecture Notes in Computer Science, vol 14583. Springer, Cham. https://doi.org/10.1007/978-3-031-54770-6_12

  • DOI: https://doi.org/10.1007/978-3-031-54770-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54769-0

  • Online ISBN: 978-3-031-54770-6

  • eBook Packages: Computer Science, Computer Science (R0)
