Quantile estimation for encrypted data

Park, Minje; Kim, Jaeseon; Shin, Sungchul; Park, Cheolwoo; Jeon, Jong-June; Kwon, SoonSun; Choi, Hosik

doi:10.1007/s10489-023-04837-5

Published: 28 July 2023

Volume 53, pages 24782–24791, (2023)

Applied Intelligence

Minje Park¹,
Jaeseon Kim¹,
Sungchul Shin¹,
Cheolwoo Park²,
Jong-June Jeon³,
SoonSun Kwon⁴ &
…
Hosik Choi ORCID: orcid.org/0000-0003-0589-8043⁵

Abstract

As data-driven studies continue to grow, privacy protection has become a crucial issue. Homomorphic encryption (HE) is one proposed solution; however, elementary operations on HE ciphertexts take far longer than the same operations on plaintexts. As a result, handling ciphertexts is much more complex than handling plaintexts, which limits many subsequent data analyses. This paper proposes a quantile estimation method for encrypted data; quantiles are core statistics for understanding a data distribution. We developed an HE-friendly method for large homomorphically encrypted data using an approximate quantile loss function. Numerical studies show that the proposed method significantly reduces the calculation time for simulated and real homomorphically encrypted data. Specifically, the proposed method takes approximately 26 minutes on a dataset of size four million, which is about 14 times faster than the sorting method. Furthermore, we applied the proposed method to construct boxplots for homomorphically encrypted data.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

Hosik Choi was supported by the 2020 Research Fund of the University of Seoul. Cheolwoo Park's work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (NRF-2021R1A2C1092925, NRF-2022M3J6A1063021). Sungchul Shin's work was supported by the Ministry of Land, Infrastructure and Transport (RS-2022-0144012).

Author information

Authors and Affiliations

CryptoLab, Seoul, South Korea
Minje Park, Jaeseon Kim & Sungchul Shin
Department of Mathematical Sciences, KAIST, Daejeon, South Korea
Cheolwoo Park
Department of Statistics, University of Seoul, Seoul, South Korea
Jong-June Jeon
Department of Mathematics/Artificial Intelligence, Ajou University, Suwon-si, South Korea
SoonSun Kwon
Department of Urban Big Data Convergence, University of Seoul, Seoul, South Korea
Hosik Choi

Corresponding author

Correspondence to Hosik Choi.

Ethics declarations

Conflicts of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Ethical standard

The authors state that this research complies with ethical standards. This research does not involve either human participants or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The pseudo-code of the NAG method is described in Algorithm 1.

The pseudo-code of the Newton method is described in Algorithm 2.
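
Algorithms 1 and 2 are not reproduced in this section. As a rough guide only, the textbook forms of the two updates for a generic smooth objective \(L(q)\) of a scalar quantile estimate \(q\) can be written as follows. The symbols \(\gamma\) (step size), \(\mu\) (momentum) and \(v_t\) (velocity, with \(v_0 = 0\)) are notation assumed here for illustration; how they map onto the paper's \(\alpha\) and \(\eta\) is not stated in this section, and the authors' algorithms may differ in detail.

\[
\begin{aligned}
\text{NAG:}\quad & v_{t+1} = \mu\, v_t - \gamma\, L'(q_t + \mu\, v_t), \qquad q_{t+1} = q_t + v_{t+1},\\
\text{Newton:}\quad & q_{t+1} = q_t - \frac{L'(q_t)}{L''(q_t)}.
\end{aligned}
\]

The Newton update uses second-derivative information and a division, whereas NAG needs only first derivatives and a momentum term.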

1.1 Choosing the best epoch

Because it is challenging to monitor the objective value under HE, the number of epochs must be fixed in advance. To determine the number of epochs and to examine the convergence of the objective values, we ran both methods on randomly generated data in plaintext. Fig. 3 shows the trajectories of the absolute error at each epoch at \(\tau =0.25\) for the NAG method (blue, dashed line) with \(\alpha =0.1\) and \(\eta =0.4\), and for the Newton method (red, solid line) with \(\alpha =0.1\). Across multiple s values, the NAG method converges within 10 epochs, while the Newton method converges within 5 epochs, indicating that Newton converges faster than NAG. Based on this observation, we used 20 and 10 epochs for NAG and Newton, respectively, in our numerical study to ensure convergence. We find that this strategy works well.
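
A minimal plaintext sketch of this kind of epoch check is given below. It is not the authors' implementation: the logistic smoothing of the check loss, the bandwidth `h`, the step size `lr`, the `momentum` value, the uniform test data and the tolerance are all assumptions made for illustration, and none of them is claimed to correspond to the paper's \(\alpha\) or \(\eta\). Only the epoch budgets (20 for NAG, 10 for Newton) and \(\tau = 0.25\) are taken from the text.

```python
import numpy as np


def sigmoid(z):
    # Logistic function written via tanh so large |z| does not overflow.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def grad(q, x, tau, h):
    # First derivative in q of the mean smoothed check loss
    #   (1/n) * sum_i [ tau * u_i + h * log(1 + exp(-u_i / h)) ],  u_i = x_i - q.
    # (Assumed smoothing; the paper's approximate loss is not shown here.)
    return np.mean(sigmoid((q - x) / h)) - tau


def hess(q, x, h):
    # Second derivative in q of the same smoothed loss.
    s = sigmoid((q - x) / h)
    return np.mean(s * (1.0 - s)) / h


def run_nag(x, tau, h, lr, momentum, epochs, q0):
    # Nesterov accelerated gradient: gradient is taken at the look-ahead point.
    q, v, path = q0, 0.0, []
    for _ in range(epochs):
        v = momentum * v - lr * grad(q + momentum * v, x, tau, h)
        q = q + v
        path.append(q)
    return np.array(path)


def run_newton(x, tau, h, epochs, q0):
    # Plain Newton iteration; may be unstable if hess() is close to zero,
    # e.g. when q0 lies far outside the range of the data.
    q, path = q0, []
    for _ in range(epochs):
        q = q - grad(q, x, tau, h) / hess(q, x, h)
        path.append(q)
    return np.array(path)


def first_epoch_below(path, target, tol):
    # 1-based index of the first epoch whose absolute error is below tol,
    # or None if the tolerance is never reached.
    hits = np.nonzero(np.abs(path - target) < tol)[0]
    return int(hits[0]) + 1 if hits.size else None


# Illustrative run on synthetic plaintext data scaled to [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100_000)
tau = 0.25
target = np.quantile(x, tau)  # plaintext reference quantile

nag_path = run_nag(x, tau, h=0.02, lr=0.5, momentum=0.4, epochs=20, q0=0.5)
newton_path = run_newton(x, tau, h=0.02, epochs=10, q0=0.5)

print("NAG    abs. errors:", np.abs(nag_path - target).round(4))
print("Newton abs. errors:", np.abs(newton_path - target).round(4))
print("NAG    first epoch under 0.01:", first_epoch_below(nag_path, target, 0.01))
print("Newton first epoch under 0.01:", first_epoch_below(newton_path, target, 0.01))
```

Once the plaintext error curve has flattened, a fixed epoch budget with some slack can be hard-coded for the encrypted run, where the objective cannot be inspected. The budgets in the text (20 and 10) are twice the observed convergence points (10 and 5).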

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Park, M., Kim, J., Shin, S. et al. Quantile estimation for encrypted data. Appl Intell 53, 24782–24791 (2023). https://doi.org/10.1007/s10489-023-04837-5

Accepted: 23 June 2023
Published: 28 July 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10489-023-04837-5